Shop products

Many pages on the Tate website display 'cards' for products from Tate's online shop. The product details are obtained from the Salesforce API.

Shop API

The TateShopAPI class defined in tate/shop/shop_api.py serves as a Python client for accessing the two relevant endpoints of the Salesforce API: /products and /product_search.

For full details of how these endpoints behave, refer to the Salesforce API documentation.

Initialising TateShopAPI

A working instance of TateShopAPI must always be initialised with correct values for api_host and client_id. These should be defined in the settings applied to the production application.

It is important that client_id be treated as a secret and not committed to this or any other repository in plain text.

api_host is the base URL of the API, to which the path of the endpoint being queried is appended. The final segment of this URL specifies the API version (in the format YY_M). There is a case for treating this version number as a setting separate from the base URL of the API.
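
As a minimal sketch of that idea, the version could be held in its own setting and joined to the base URL when the client is configured. The setting names (SHOP_API_BASE_URL, SHOP_API_VERSION, SHOP_API_CLIENT_ID), the example host and the example version value are all assumptions made for illustration; none of them is taken from this codebase.

```python
import os


def build_api_host(base_url: str, version: str) -> str:
    """Join a base URL and a version segment into a single api_host value."""
    return f"{base_url.rstrip('/')}/{version.strip('/')}"


# Hypothetical settings: the names and default values are illustrative only.
SHOP_API_BASE_URL = os.environ.get("SHOP_API_BASE_URL", "https://shop.example.com/api")
SHOP_API_VERSION = os.environ.get("SHOP_API_VERSION", "21_3")  # YY_M format
# Read the secret from the environment so it never appears in the repository.
SHOP_API_CLIENT_ID = os.environ.get("SHOP_API_CLIENT_ID", "")

# The combined value is what would be passed to the client as api_host.
SHOP_API_HOST = build_api_host(SHOP_API_BASE_URL, SHOP_API_VERSION)
```

Keeping the version separate would mean an API upgrade only touches one short setting, while the secret client_id continues to come from the environment.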

Using TateShopAPI

TateShopAPI supports two public methods: .products() and .product_search(), corresponding to the Salesforce endpoints with the same names.

TateShopAPI.products() retrieves the details of products with known product IDs. These IDs should be passed as a list of strings and/or integers in the required ids argument.
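
Because ids may mix strings and integers, the values have to be brought into a single form before they can be placed in a request. The helper below is a hypothetical sketch of that step, not the code in shop_api.py: the function name and the decision to drop blanks and duplicates are assumptions.

```python
from typing import Iterable, List, Union


def normalise_ids(ids: Iterable[Union[str, int]]) -> List[str]:
    """Return product IDs as stripped strings, dropping blanks and duplicates."""
    seen = set()
    result = []
    for raw in ids:
        value = str(raw).strip()
        # Skip empty values and anything already collected, keeping first-seen order.
        if value and value not in seen:
            seen.add(value)
            result.append(value)
    return result
```

With a helper of this kind, a caller passing integers and a caller passing the same IDs as strings would produce an identical request.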

TateShopAPI.product_search() retrieves products that match a product category (using a filtered search based on category ID), a keyword query, or a combination of the two.

Keyword query pitfalls

Keyword queries for products in the Tate online shop are prone to returning unwanted and irrelevant results thanks to the following factors:

Tate's product catalogue contains a mixture of event tickets and genuine shop products. There is no reliable way to exclude tickets from search results.
The /product_search endpoint of the Salesforce API seems to prioritise recall over precision. When multiple terms appear in the query, search results will not necessarily contain every term. Matches will also be returned when query terms are found only in the product description, which may be several paragraphs long.

Nevertheless, a keyword search to the /product_search endpoint offers our best hope of obtaining a set of products relevant to an artist or artwork, to render at the end of a page in the website's Art & Artists section. Hence we need to overcome these drawbacks as best we can.

We also need to allow for discrepancies between how artist names and artwork titles appear in the collection database vs. how these same topics are named in the shop's product catalogue. If we generate keyword queries based on the raw data drawn from the collection database, we run the risk of finding misleading matches (because our query contained widely-used terms) or failing to find any matching results (because our query passed in too many terms from the collection database).

We try to mitigate these issues via a combination of measures, applied before and after the API query is executed.

Before sending a keyword query to the API, the website frontend uses the helper methods in tate/shop/tate_shop.py to construct the most suitable keyword query for products relating to either an artist or an artwork, using data obtained from the collection database. (The inline comments in this file explain the approach in more detail.)
Once search results have been obtained, TateShopAPI.product_search() performs a sanity check on each product in the results list. A product is removed from the set if none of the original query terms is present in the product title (i.e. the result matched only by virtue of the product description), if there is no image of the product, or if the image does not have the path we expect for a true product image. The two image checks should allow us to exclude event tickets from the results we return to the frontend context.
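
The post-query checks described above could be sketched as follows. This is an illustration only, not the code in shop_api.py: the product dictionary keys (title, image_url), the function names and the expected image path are all hypothetical, and the real response structure may differ.

```python
from typing import Dict, List, Optional

Product = Dict[str, Optional[str]]

# Hypothetical marker for genuine product images; the real path is not shown here.
EXPECTED_IMAGE_PATH = "/images/products/"


def is_relevant(product: Product, query: str) -> bool:
    """Apply the three checks: title match, image present, expected image path."""
    title = (product.get("title") or "").lower()
    terms = query.lower().split()
    # Keep a result only if at least one query term occurs in the title
    # (a simple case-insensitive substring test).
    if not any(term in title for term in terms):
        return False
    image_url = product.get("image_url")
    # Reject products with no image at all.
    if not image_url:
        return False
    # Reject images stored outside the expected product image path.
    return EXPECTED_IMAGE_PATH in image_url


def filter_results(products: List[Product], query: str) -> List[Product]:
    """Return only the products that pass every check, preserving order."""
    return [p for p in products if is_relevant(p, query)]
```

Running the checks after the query, rather than trying to express them in the query itself, keeps the request simple and leaves the relevance rules in one place where they can be adjusted.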