Runaway Update, AI POC Model for Feature Extraction

The MTurk workers have completed 10,000 ad transcriptions. I have added these to our database and assigned them event IDs, and I am hopeful that the remaining ads will be finished by tomorrow.

I have also been working on an AI model for feature extraction. For the first version of the model I picked four features: Reward, Name, Age and Skin Tone. The AI’s job is to pull out the parts of the ad text that match those categories so that the data can be analyzed further. For example, the extracted skin tone text will need to be run through a secondary model that groups it into the three categories we discussed.
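To make the extraction step concrete, here is a minimal sketch of how labeled spans could be collected into per-feature values. The span format (start, end, label), the label names, the `group_spans` helper and the sample ad text are all assumptions made up for illustration, not the actual model code or real data.

```python
from collections import defaultdict

# Hypothetical label set matching the four features in the first version.
LABELS = {"REWARD", "NAME", "AGE", "SKIN_TONE"}

def group_spans(text, spans):
    """Collect the text of each labeled span under its feature label.

    `spans` is a list of (start, end, label) character offsets. This is
    an assumed output format, not necessarily what the model emits.
    """
    features = defaultdict(list)
    for start, end, label in spans:
        if label not in LABELS:
            continue  # ignore labels outside the assumed feature set
        features[label].append(text[start:end])
    return dict(features)

# Invented sample ad, for illustration only.
ad = "Ran away, a man named Tom, about 25 years old. Ten dollars reward."
spans = [(22, 25, "NAME"), (33, 35, "AGE"), (47, 58, "REWARD")]
print(group_spans(ad, spans))
```

Keeping a list per label means an ad that lists multiple runaways can carry several names or ages without losing any of them.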

I’m pretty happy with the results for a first attempt. Here are some examples:

You can see the model made one mistake (Name: faced) but got everything else right, including the complicated ad listing multiple runaways, and this is with a very small amount of training data. I’m pretty confident that with more training data and a little more hyperparameter tuning, the model’s accuracy will rise to usable levels. That said, I think we will probably want an MTurk review of whatever results we come up with. It will still be much cheaper, faster and more accurate than a purely MTurk solution.

I think we need to decide exactly what our features are going to be so I can have MTurk workers label more training data. I can always add more features later, but the training set will then have to be updated to include whatever we add. Here is a list of features I think we should include. Please let me know what to add or change:

-Name

-Age

-Skin Tone (three categories: light, average, dark)

-Reward Offered

-Secondary rewards offered

-Time of year the runaway event occurred

-Runaway location origin
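
One way the proposed features could be laid out as a training record is sketched below. The class name, field names and types are assumptions for discussion and would change with whatever features we settle on.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# The three skin tone categories from the proposed feature list.
SKIN_TONE_CATEGORIES = ("light", "average", "dark")

@dataclass
class RunawayRecord:
    """Hypothetical record layout for one runaway in one ad."""
    name: Optional[str] = None
    age: Optional[str] = None  # kept as text in case the ad gives an approximate age
    skin_tone: Optional[str] = None  # one of SKIN_TONE_CATEGORIES, or None if absent
    reward_offered: Optional[str] = None
    secondary_rewards: List[str] = field(default_factory=list)
    time_of_year: Optional[str] = None
    origin_location: Optional[str] = None

    def __post_init__(self):
        # Reject any skin tone value outside the three agreed categories.
        if self.skin_tone is not None and self.skin_tone not in SKIN_TONE_CATEGORIES:
            raise ValueError(f"unknown skin tone category: {self.skin_tone!r}")

record = RunawayRecord(name="Tom", age="about 25", skin_tone="dark")
print(record)
```

Every field is optional because a given ad may not mention all of the features.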

Thanks,

Eric Anderson, CFA

Quantitative Analyst

Skylar Capital Energy Global Master Fund LP

(713) 341-7985 work

(281) 606-9371 cell