Batch updating allows your plugin to update or set one or more user properties in the current project's data in large batches. An example would be a plugin that predicts the likelihood of conversion for every user in the current project.
For each user it saves a number from 0.0 to 1.0, based on how likely it is that this user will convert. This property can then be re-used everywhere on the platform, such as segments, report filters and more.
Configuring a plugin as batch updating
To make your plugin support batch updating, one of the plugin's JSON result stages needs to output an object under the key batches on the root object. This can be added to the output JSON from the initial stage or any additional stage. If multiple stages output a batches object, the one from the last stage is used.
Here is a basic
batches object specification that is added on the
As can be seen in the example above, the
batches object is pretty straightforward:
- maxBatchSize: The number of user objects to score in a single batch. Defaults to 10,000; the minimum is 1,000 and the maximum is 10,000,000.
- options: Optional. This object is passed on to the manifest of the batch stage. This way you can easily share parameters from, for example, the model training stage with the batch stage.
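As a minimal sketch of the two fields described above, the following Python builds a results object with a batches key and checks maxBatchSize against the documented limits. The helper name make_batches and the option name modelFile are hypothetical; only batches, maxBatchSize and options come from the specification, and any other keys your results JSON needs are left out here.

```python
import json

# Limits as documented above for maxBatchSize.
MIN_BATCH_SIZE = 1_000
MAX_BATCH_SIZE = 10_000_000
DEFAULT_BATCH_SIZE = 10_000


def make_batches(max_batch_size=DEFAULT_BATCH_SIZE, options=None):
    """Build a batches object (hypothetical helper, not part of the platform)."""
    if not MIN_BATCH_SIZE <= max_batch_size <= MAX_BATCH_SIZE:
        raise ValueError(
            f"maxBatchSize must be between {MIN_BATCH_SIZE} and {MAX_BATCH_SIZE}"
        )
    batches = {"maxBatchSize": max_batch_size}
    if options is not None:
        # options is optional and is passed on to the batch stage manifest.
        batches["options"] = options
    return batches


# "modelFile" is an illustrative option name, not one defined by the platform.
results = {"batches": make_batches(5_000, {"modelFile": "model.bin"})}
results_json = json.dumps(results, indent=2)
```

Validating the size before writing the results file surfaces an out-of-range value during local development rather than at run time.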
Batch updating plugin manifest
When a plugin supports batch updating, it again receives a JSON manifest for the plugin to read, just like in any other stage such as the initial one.
During local development, get your batch stage manifest by doing a GET like this. Note that the last path segment, for the stage, needs to be
Store the result in a file called batch-manifest.json, as we'll use it in the next step to launch the plugin's batch process.
Here is what the JSON manifest will look like:
First, note that the stage will always be batch for the batch updating part of a plugin. When the plugin loads the JSON manifest and finds that manifest.stage == 'batch', it simply starts the batch scoring part of the plugin code.
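The stage check can be sketched in Python as follows. This is only an illustration: load_manifest, run_plugin and the two handler callables are hypothetical names, and the only thing taken from the manifest is its stage field.

```python
import json


def load_manifest(path):
    """Read a JSON manifest file from disk."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def run_plugin(manifest, run_batch, run_regular):
    """Dispatch to the batch scoring code when manifest.stage is 'batch'."""
    if manifest.get("stage") == "batch":
        return run_batch(manifest)
    # Any other stage (initial or additional) takes the regular code path.
    return run_regular(manifest)
```

A plugin entry point could then call run_plugin(load_manifest("batch-manifest.json"), score_batch, train_model), where score_batch and train_model stand in for your own functions.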
Data URLs are available under dataUrls, because we need to get a batch of, say, 1000 user records with their features and score each of them. Any of the datasets requested in the initial or additional stages are available here. Note that each data URL has range_end_lt query parameters appended automatically. This is so that we get roughly the number of users per batch that we requested during earlier stages with the
To update the actual user properties, we need to upload a JSON file with the updated properties for each user. This is done via the batch property; more on this later.
You can download files uploaded to storage in previous stages using the downloadUrls, appending the file path used during uploading, for example
The options object is an exact copy of what was specified in the previous stage under the batches object. The metadata is the exact same object as for any previous stage.
Batch updating user properties
A batch updating plugin should generate two files.
The first is results.json which, unlike in regular plugin runs, has no data or other properties and only contains the status object, like below:
If any of the batches does not return a success code, the plugin run as a whole will fail.
The other file is a
data.json file, which contains the actual users and property values that should be updated/set:
As can be seen in the example above the
data.json file is pretty straightforward:
- category: Optional, but recommended. Each time this plugin runs, the properties are saved under the name of the plugin plus the timestamp of the run. If category is given, those properties end up under their own sub-menu in the user properties menu. If it is not used, the properties appear at the root of the user properties.
- properties: An array of strings. Each element is the name of a user property being set or updated. Must contain at least one element.
- updates: An array of arrays. Each element starts with the user_id (see Dataset and features), followed by one value for each entry in the properties array, in the same order. So in this example the values "A", "A", "B", "C" are for the Class property and the values 0.43, 0.59, 0.37, 0.01 are for the "Rating" property. The first element thus updates Class for that user to "A" and "Rating" to 0.43.
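The structure described above can be assembled in Python roughly as follows. This is a sketch under assumptions: build_batch_data and write_data_json are hypothetical helpers, and the user IDs and category name are made up for illustration. Only the keys category, properties and updates come from the description.

```python
import json


def build_batch_data(properties, rows, category=None):
    """Assemble the data.json payload from property names and per-user rows.

    Each row is [user_id, value_1, ..., value_n], with one value per property.
    """
    if not properties:
        raise ValueError("properties needs at least one element")
    updates = []
    for user_id, *values in rows:
        if len(values) != len(properties):
            raise ValueError(f"row for {user_id!r} does not match properties")
        updates.append([user_id, *values])
    data = {"properties": list(properties), "updates": updates}
    if category is not None:
        data["category"] = category
    return data


def write_data_json(data, path="data.json"):
    """Write the payload to disk so it can be uploaded afterwards."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(data, fh)


# Illustrative values only; the user IDs and category are hypothetical.
data = build_batch_data(
    ["Class", "Rating"],
    [["user-1", "A", 0.43], ["user-2", "A", 0.59],
     ["user-3", "B", 0.37], ["user-4", "C", 0.01]],
    category="Conversion model",
)
```

Checking that every row has exactly one value per property catches misaligned updates before the file is uploaded.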
Running the plugin
Once you're ready to test your batch updating plugin, start it with the command below, where
batch-manifest.json was generated in the previous step:
Once the run has finished, you should have a
Your code should automatically upload the data.json file using the batch key of getUploadUrls, so you'll have to add that part before your plugin is complete. First get a signed upload URL by doing a request on getUploadUrls for the batch key, with /data.json appended to it. Then make a PUT request on that signed upload URL as-is, sending the data.json file. Here's the flow using curl:
You can either implement this using an HTTP request library for your language, or simply call curl through a system exec from your plugin code, as curl is available in your plugin runner environment.
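For plugins written in Python, the same two-step flow might look like the sketch below, using only the standard library. It rests on assumptions that should be checked against your manifest: that manifest["getUploadUrls"]["batch"] is a URL to which the file path can be appended, that the first request is a GET, and that its response body is the signed URL as plain text. The function names are hypothetical.

```python
import urllib.request


def signed_url_endpoint(manifest, file_path="/data.json"):
    """Return the URL to request a signed upload URL from.

    Assumption: the batch entry of getUploadUrls is a URL and the
    file path is simply appended to it.
    """
    return manifest["getUploadUrls"]["batch"] + file_path


def build_put_request(signed_url, payload):
    """Create a PUT request that sends the payload to the signed URL as-is."""
    return urllib.request.Request(signed_url, data=payload, method="PUT")


def upload_data_json(manifest, path="data.json"):
    """Fetch a signed upload URL, then PUT the data.json file to it."""
    with urllib.request.urlopen(signed_url_endpoint(manifest)) as resp:
        # Assumption: the response body is the signed URL as plain text.
        signed_url = resp.read().decode("utf-8").strip()
    with open(path, "rb") as fh:
        payload = fh.read()
    with urllib.request.urlopen(build_put_request(signed_url, payload)) as resp:
        return resp.status
```

Calling upload_data_json(manifest) at the end of the batch stage would return the HTTP status of the PUT request; urlopen raises an exception on error responses, so a failed upload is not silently ignored.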