Otaku Square has used a few search engines throughout its existence, going from a simple MySQL query, to ElasticSearch, to Algolia and now finally TypeSense.

In this article we'll go through why I chose TypeSense, and how the technical implementation went.

The quest for a new search engine

The quest for a new search engine has been on ever since Algolia introduced its new pricing scheme. The new pricing would push our bill from 25 euros to an amount in the hundreds of euros... yeah, miss me with that.

Another shortcoming that I have been coping with for a while in Algolia is the lack of a good sorting option, other than just replicating your entire index with different sorting options.

So you can imagine how excited I was when one of my co-workers introduced me to MeiliSearch, which is basically everything Algolia is... but free and open source.

One shortcoming of MeiliSearch, however, is that it has no sorting option built in. But hey, we can host it ourselves, so we can just replicate it however much we want, right? Yes, there is no catch here.

Buttttt... a few weeks later (before I had gotten around to implementing MeiliSearch) that same co-worker introduced me to TypeSense, which is basically MeiliSearch, but written in C++ and with built-in sorting capabilities.

I spun up a new server, started playing around with it, and was pretty stoked to find that it ticked basically all my boxes.

There are still some limitations to TypeSense, such as the fact that you cannot exclude a taxon from your query, but hey, it's on the roadmap and I'll find a way around it.

Implementing TypeSense

So we have our search server up and running, coffee has been tapped, and it's 3 AM. Time to hack away at the codebase to convert it to TypeSense! Except, there is not a lot of hacking required at all.

While building the new store, I had the foresight to turn everything search-related into interfaces which can be implemented however you like. So it's really just a matter of implementing those interfaces and swapping out a "Searcher" class.

The populating side

But first, we have to make sure our search engine has some data. For that, I made a Symfony Messenger message / handler pair in the product API. This looks a bit like this:

public function upsertProduct(Product $product, ?Channel $channel)
{
    $viewer = $this->productManager->getProductViewer($product);
    $viewer->setViewChannel($channel);

    $images = $viewer->getMediaFilteredByType('image');
    $thumb = null;

    if (count($images)) {
        $thumb = $images[0]->getThumbnail();
    }

    $taxons = $viewer->getTaxons();

    $document = [
        'id' => (string) $product->getId(),
        'name' => $viewer->getAttributeValueFromStringAsString('attribute-general-display-name'),
        'priceInclVat' => round($viewer->getPrice(true), 2),
        'priceExclVat' => round($viewer->getPrice(false), 2),
        'taxCategory' => $viewer->getTaxCategory(),
        'availability' => $viewer->getAvailabilityAsState(),
        'image' => $thumb ?? 'https://cdn.otakusquare.com/static/no-image.png',
        'url' => '/products/' . $viewer->getAttributeValueFromStringAsString('attribute-general-slug') . '/view',
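        // Apply the null fallback before casting; "(string) $x ?? ''" casts first, so the fallback never runs.
        'ean' => (string) ($viewer->getAttributeValueFromStringAsString('attribute-erp-ean-code') ?? ''),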
        'nsfw' => (bool) $viewer->getAttributeValueFromStringAsString('attribute-general-nsfw'),
        'categories' => $this->buildFacets('category', $taxons),
        'manufacturers' => $this->buildFacets('manufacturer', $taxons),
        'franchises' => $this->buildFacets('franchise', $taxons),
        'packaging' => $this->buildFacets('packaging', $taxons),
        'materials' => $this->buildFacets('material', $taxons),
        'themes' => $this->buildFacets('theme', $taxons),
        'timeCreated' => $product->getDateCreated()->getTimestamp(),
        'timeUpdated' => $product->getDateUpdated()->getTimestamp(),
    ];

    $this->typeSenseConnector->getCollections()[
        $this->typeSenseHelper->getIndexForChannel($channel, 'products')
    ]->documents->upsert($document);
}
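
The collection has to exist before the first upsert, and its schema decides which fields can be searched and faceted. What follows is only a sketch of a schema that would fit the document above: buildProductSchema() is a hypothetical helper, and the field types are guesses based on the values being upserted (for instance, that buildFacets() returns arrays of strings).

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: builds a Typesense collection schema for the product
 * document. Field types are assumptions derived from the upserted values.
 */
function buildProductSchema(string $collectionName): array
{
    // Taxon-based fields, assumed to be arrays of strings.
    $taxonFields = ['categories', 'manufacturers', 'franchises', 'packaging', 'materials', 'themes'];

    $fields = [
        ['name' => 'name', 'type' => 'string'],
        ['name' => 'priceInclVat', 'type' => 'float'],
        ['name' => 'priceExclVat', 'type' => 'float'],
        ['name' => 'taxCategory', 'type' => 'string'],
        ['name' => 'availability', 'type' => 'string', 'facet' => true],
        ['name' => 'image', 'type' => 'string'],
        ['name' => 'url', 'type' => 'string'],
        ['name' => 'ean', 'type' => 'string'],
        ['name' => 'nsfw', 'type' => 'bool'],
        ['name' => 'timeCreated', 'type' => 'int64'],
        ['name' => 'timeUpdated', 'type' => 'int64'],
    ];

    // Anything that should show up as a facet needs 'facet' => true.
    foreach ($taxonFields as $field) {
        $fields[] = ['name' => $field, 'type' => 'string[]', 'facet' => true];
    }

    return ['name' => $collectionName, 'fields' => $fields];
}

// With the official typesense-php client, the result would be passed to:
// $client->collections->create(buildProductSchema('products_211220200045'));
```

The "id" field is left out on purpose, since Typesense treats it as the document identifier without it being declared in the schema.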

Now that the handler is in place, I just dispatch the message whenever a product is created or updated. For good measure I also added a sanity command, which dispatches the message for every product present.

As you'll probably note, the populator also supports upserting to multiple channels. The indexes for a channel are defined in a simple YAML file:

typesense_indexes:
    otakusquare-com:
        products: products_211220200045
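
To make the lookup concrete, here is a minimal sketch of how a helper could resolve a collection name from that configuration. This is not the actual typeSenseHelper: the function below is a stand-in that takes the parsed YAML as a plain array and a channel code as a string, both of which are assumptions.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical stand-in for the helper used in upsertProduct(): looks up the
 * collection name for a channel code and index type in the parsed config.
 */
function getIndexForChannel(array $indexes, string $channelCode, string $type): string
{
    if (!isset($indexes[$channelCode][$type])) {
        throw new InvalidArgumentException(
            sprintf('No "%s" index configured for channel "%s".', $type, $channelCode)
        );
    }

    return $indexes[$channelCode][$type];
}

// This array mirrors the typesense_indexes YAML.
$indexes = [
    'otakusquare-com' => [
        'products' => 'products_211220200045',
    ],
];

echo getIndexForChannel($indexes, 'otakusquare-com', 'products'), PHP_EOL;
```

Failing loudly on an unknown channel is a design choice in this sketch; it keeps a typo in the YAML from silently sending documents to the wrong collection.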

The searching side

The shop side of the search implementation is pretty cookie cutter really. All I need to do is implement a few interfaces:

  • QueryInterface: contains all query information
  • ResultInterface: represents a single result
  • ResultSetInterface: represents a collection of results
  • SearcherInterface: the business side, handles actual search logic

The query, result and result set are pretty straightforward and are basically just generic classes.
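
As a rough illustration of the idea, the four interfaces could look something like the sketch below. Only the interface names come from the list above; every method signature is an assumption, and ArraySearcher is a toy in-memory implementation that exists purely to show that any engine can sit behind SearcherInterface.

```php
<?php
declare(strict_types=1);

// Only the four interface names come from the article; every method
// signature below is an illustrative assumption.
interface QueryInterface
{
    public function getQuery(): string;
    public function getPage(): int;
    public function getPerPage(): int;
}

interface ResultInterface
{
    public function getName(): string;
}

interface ResultSetInterface
{
    /** @return ResultInterface[] */
    public function getResults(): array;
    public function getTotal(): int;
}

interface SearcherInterface
{
    public function runQuery(QueryInterface $query): ResultSetInterface;
}

// Toy in-memory searcher, to show that any engine can sit behind the interface.
final class ArraySearcher implements SearcherInterface
{
    /** @var string[] */
    private array $names;

    public function __construct(array $names)
    {
        $this->names = $names;
    }

    public function runQuery(QueryInterface $query): ResultSetInterface
    {
        $needle = $query->getQuery();

        // Case-insensitive substring match; an empty query matches everything.
        $matches = array_values(array_filter(
            $this->names,
            static fn (string $name): bool => $needle === '' || stripos($name, $needle) !== false
        ));

        // Cut out the requested page, then wrap each name in a result object.
        $offset = ($query->getPage() - 1) * $query->getPerPage();
        $results = array_map(
            static fn (string $name): ResultInterface => new class($name) implements ResultInterface {
                private string $name;
                public function __construct(string $name) { $this->name = $name; }
                public function getName(): string { return $this->name; }
            },
            array_slice($matches, $offset, $query->getPerPage())
        );

        return new class($results, count($matches)) implements ResultSetInterface {
            private array $results;
            private int $total;
            public function __construct(array $results, int $total)
            {
                $this->results = $results;
                $this->total = $total;
            }
            public function getResults(): array { return $this->results; }
            public function getTotal(): int { return $this->total; }
        };
    }
}
```

Code that depends only on SearcherInterface does not care whether it gets this toy class or a TypeSense-backed one, which is what makes swapping engines cheap.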

The searcher handles all actual logic, and moulds this information into something the search engine can handle, and vice versa.

For TypeSense, this moulding looks a bit like this:

public function runQuery(ProductQueryInterface $query): ProductResultSet
{
    $q = $query->getQuery();

    $params = [
        'q' => $q,
        'query_by' => 'name,manufacturers,franchises,ean',
        'per_page' => $query->getPerPage(),
        'facet_by' => implode(',', [
            'themes',
            'availability',
            'categories',
            'franchises',
            'manufacturers'
        ]),
        'sort_by' => $query->getSort()['field'] . ':' . $query->getSort()['order'],
        'filter_by' => implode(' && ', $query->buildFacets()),
        'page' => $query->getPage(),
        'facet_query' => implode(' && ', $query->buildFacets()),
        'max_facet_values' => 1000,
    ];

    $results = $this->typeSenseConnector->getCollections()['products_otakusquare-com']->documents->search($params);

    return new ProductResultSet($results);
}

Now, if you're paying attention, you'll see that the index name here differs from the index defined in the YAML file earlier.

This is because I opted to create an alias for my indexes, so I can easily run a re-index in the background if I screw up while keeping the last working index available to customers.
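
A minimal sketch of that alias swap, under a few assumptions: the function names are hypothetical, the date layout of the collection name is inferred from "products_211220200045", and $client is expected to expose the same aliases->upsert() call as the official typesense-php client.

```php
<?php
declare(strict_types=1);

/**
 * Builds a timestamped collection name such as "products_211220200045".
 * The day-month-year-hour-minute layout is inferred from the YAML example
 * and is an assumption.
 */
function versionedCollectionName(string $base, DateTimeInterface $now): string
{
    return $base . '_' . $now->format('dmYHi');
}

/**
 * Points an alias at a collection. $client is assumed to behave like the
 * official typesense-php client, i.e. to offer $client->aliases->upsert().
 */
function pointAliasAt(object $client, string $alias, string $collection): array
{
    return $client->aliases->upsert($alias, ['collection_name' => $collection]);
}

// Rough flow for a background re-index:
//   1. create a new collection named versionedCollectionName('products', $now)
//   2. fill it while the alias still serves the previous collection
//   3. pointAliasAt($client, 'products_otakusquare-com', $newCollectionName)
```

Because searches only ever go through the alias, customers keep hitting the old collection until the very last step.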

Populating the index

So we've got our populating code ready, our searching code ready, now we just need to populate the index and we're good to go!

However, we kind of have to insert around 16,000 documents... which takes a while given the amount of moulding we need to do to the data.

But, we have our message consumer running in supervisor... and consuming 4 messages at a time? Nah my friend, those are rookie numbers. You need to push those numbers up like there's no tomorrow.

So uh... I spun up 40 message consumers.

PROCESSING POWEEEEEEEER

The moment of truth

With all this implemented and the search engine populated, it's time to hit the deploy button, cross my fingers and ask myself why I decided to do this at 6 AM, as I enter a search query thinking: please, pleeeaaaasssseee just work.

Yeah, yeah it does.

Demo?

The search engine is currently live and in use at otakusquare.com, so feel free to take a look!