Did you know that the first day Cloud Retail Search is live on your website may be the day it returns its worst results? While this may sound surprising or disappointing, the reality is that training complex AI models takes time. It also means the only direction is up!
Even if the results are dramatically better than your prior search solution, it’s nice to know what additional improvements you can make and understand how the models will improve over time with additional data. It can be tempting to massage the results by creating excessive Serving Controls, but this can lead to model overfitting. To accelerate model training and ensure your site is set up for long-term success while following best practices, try these four tips:
Import User Events
Many retailers spend most of their time focusing on product catalog data preparation and ingestion while ignoring user event ingestion. Some retailers never ingest user events at all, and by doing so they’re missing out on a crucial machine learning model training input.
Importing historical user events, if you have them available, can also be very valuable for accelerating model training and improving results. You will also find that Recommendations AI models have varying event data requirements, and more model options will open up to you the longer the Retail API is live on your website.
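To make the historical import idea concrete, here is a minimal Python sketch that converts rows from an existing clickstream log into JSON lines shaped like user events. The input row layout (`type`, `visitor`, `ts`, `query`, `product_ids`) is hypothetical, and the output field names (`eventType`, `visitorId`, `eventTime`, `searchQuery`, `productDetails`) are assumptions that should be checked against the Retail API user event schema before any real import.

```python
import json
from datetime import datetime, timezone


def to_user_event(row):
    """Map one historical log row (hypothetical shape) to a user-event dict."""
    event = {
        "eventType": row["type"],
        "visitorId": row["visitor"],
        # Convert a Unix timestamp to an ISO 8601 string in UTC.
        "eventTime": datetime.fromtimestamp(row["ts"], tz=timezone.utc).isoformat(),
    }
    if row.get("query"):
        event["searchQuery"] = row["query"]
    if row.get("product_ids"):
        event["productDetails"] = [
            {"product": {"id": pid}} for pid in row["product_ids"]
        ]
    return event


def to_json_lines(rows):
    """Serialize events one per line, a common layout for bulk imports."""
    return "\n".join(json.dumps(to_user_event(r), sort_keys=True) for r in rows)


# Illustrative sample data only, not real traffic.
history = [
    {"type": "search", "visitor": "v-1", "ts": 1700000000,
     "query": "jeans", "product_ids": ["sku-1", "sku-2"]},
    {"type": "detail-page-view", "visitor": "v-1", "ts": 1700000060,
     "product_ids": ["sku-2"]},
]

print(to_json_lines(history))
```

The resulting lines could be written to a file and handed to whichever import path you use. The point is that search events and the product views that follow them are both preserved with a shared visitor ID.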
Ensure Data Consistency
A common example of data inconsistency is in attribute data. If a brand name is written differently throughout your catalog, products will be missing when a customer tries to filter or sort by brand.
For instance, if a brand is named “Fancy Pants Jeans,” there may be instances of “Fancy Pants,” “FPJ” or “Fancy Pants Jeans, Inc.” in your catalog data. Cleaning up your upstream data before ingestion produces more accurate results.
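One way to handle this upstream is a small normalization step that runs before ingestion. The Python sketch below maps known brand variants to a single canonical name. The alias table and the `normalize_brand` helper are hypothetical and would need to be built from your own catalog data.

```python
# Hypothetical alias table: lowercase variant -> canonical brand name.
CANONICAL_BRANDS = {
    "fancy pants": "Fancy Pants Jeans",
    "fpj": "Fancy Pants Jeans",
    "fancy pants jeans": "Fancy Pants Jeans",
    "fancy pants jeans, inc.": "Fancy Pants Jeans",
}


def normalize_brand(raw):
    """Return the canonical brand for a raw value, or the trimmed value if unknown."""
    # Collapse repeated whitespace and ignore case when looking up aliases.
    key = " ".join(raw.split()).lower()
    return CANONICAL_BRANDS.get(key, raw.strip())


def unmapped_brands(products):
    """List distinct brand values that have no alias entry, for manual review."""
    seen = set()
    for product in products:
        key = " ".join(product["brand"].split()).lower()
        if key not in CANONICAL_BRANDS:
            seen.add(product["brand"].strip())
    return sorted(seen)


catalog = [
    {"id": "sku-1", "brand": "FPJ"},
    {"id": "sku-2", "brand": "Fancy  Pants"},
    {"id": "sku-3", "brand": "Other Denim Co"},
]

for product in catalog:
    product["brand"] = normalize_brand(product["brand"])

print([p["brand"] for p in catalog])
```

Reporting the unmapped values separately, rather than guessing at them, keeps a person in the loop for brands the alias table has not seen yet.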
Check Attribute Control Settings
If your search results page is missing attributes, the “Retrievable” setting is one of the first things to check. It should be set to True for the attribute to be returned in response to search queries. Without it, only the product name will be returned. A maximum of 30 attributes can be set to retrievable.
Following Retrievable, “Indexable” and “Searchable” should both be True on important attributes. “Indexable” allows Cloud Retail Search to filter and facet using this attribute, while “Searchable” improves recall and sensitivity.
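These checks are easy to automate against an export of your attribute settings. The Python sketch below assumes a hypothetical dictionary layout (attribute name mapped to three boolean flags) and is not the Retail API's own data model. It flags important attributes that are missing a flag and warns when more than 30 attributes are retrievable.

```python
MAX_RETRIEVABLE = 30  # limit on retrievable attributes mentioned in the text
REQUIRED_FLAGS = ("retrievable", "indexable", "searchable")


def check_attribute_controls(attributes, important):
    """Return a list of human-readable problems found in the settings.

    attributes: dict of name -> {"retrievable": bool, "indexable": bool,
                                 "searchable": bool}  (hypothetical layout)
    important:  names of attributes that should have all three flags set.
    """
    problems = []

    retrievable = [name for name, s in attributes.items() if s.get("retrievable")]
    if len(retrievable) > MAX_RETRIEVABLE:
        problems.append(
            f"{len(retrievable)} retrievable attributes exceed the limit of {MAX_RETRIEVABLE}"
        )

    for name in important:
        settings = attributes.get(name)
        if settings is None:
            problems.append(f"{name}: not found in attribute settings")
            continue
        for flag in REQUIRED_FLAGS:
            if not settings.get(flag):
                problems.append(f"{name}: {flag} is not True")

    return problems


settings = {
    "brand": {"retrievable": True, "indexable": True, "searchable": True},
    "color": {"retrievable": True, "indexable": False, "searchable": True},
}

for problem in check_attribute_controls(settings, ["brand", "color", "size"]):
    print(problem)
```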
Warm Up Models in Dark Mode
If you haven’t gone live yet and want to improve your results prior to cutting over, consider warming up the Retail API using “dark mode.” This entails sending queries to Cloud Retail Search and Recommendations AI but not serving the results back to customers. The amount of time it takes to train your model largely depends on search volume, but two to three weeks is a good rule of thumb.
Warming up models only works if you also send user event data. Without it, the models have no way to know which products converted from search result pages.
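A dark mode setup can be sketched as a thin wrapper around your existing search handler. In the Python sketch below, `legacy_search`, `shadow_search`, and `record_event` are hypothetical callables standing in for your current search backend, a client that queries Cloud Retail Search, and your user event pipeline. The event dictionary is likewise an illustrative shape, not the API's schema.

```python
def handle_search(query, visitor_id, legacy_search, shadow_search, record_event):
    """Serve legacy results while also exercising the new search in the background."""
    # The customer only ever sees results from the existing solution.
    served = legacy_search(query)

    # Send the same query to the new service and discard the response.
    try:
        shadow_search(query, visitor_id)
    except Exception:
        # A failure in the shadow call must never affect the customer.
        pass

    # Record the search so models can later tie queries to conversions.
    record_event({
        "eventType": "search",
        "visitorId": visitor_id,
        "searchQuery": query,
        "servedProductIds": list(served),
    })
    return served


# Stub implementations for demonstration only.
shadow_calls = []
events = []


def demo_legacy(query):
    return ["sku-1", "sku-2"]


def demo_shadow(query, visitor_id):
    shadow_calls.append((query, visitor_id))
    return ["sku-9"]  # never shown to the customer


results = handle_search("jeans", "v-1", demo_legacy, demo_shadow, events.append)
print(results)
```

Swallowing shadow errors is a deliberate choice here: during warm-up, the new service is not yet on the critical path, so its failures should be logged and investigated rather than surfaced to shoppers.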
These four methods are common ways to improve search results returned by Google’s Retail API. Our biggest piece of advice is to let the model do the thinking. It may be tempting to add Serving Controls to create specific behavior, but usually the AI solution can train itself without human input. By creating too many guardrails too early, you may be missing out on better results.
As a gentle reminder, changes to some Attribute Controls can take up to two days to take effect. Google indexes a lot of data, and your product catalog enters a large queue of indexing jobs when you update settings. Return to the Evaluate page in the Google Cloud Retail console to check in on the progress periodically.