Interesting Issues with Clearbit

Monday, April 24th, 2023

This weekend we had an interesting issue with Clearbit's Logo Search interface - a free service they provide on their Business Information Service system. You can basically hit their endpoint with a query param of the name of a Company, and they will respond with something that looks like:

  {
    name: 'Flexbase',
    domain: 'flexbase.app',
    logo: 'https://logo.clearbit.com/flexbase.app'
  }

where the logo URL points to a nice little thumbnail logo of the Company. It's a very nice tool, and for the most part works flawlessly.

Until it doesn't.

The real issue was the Open Source Node client that was hitting the company's endpoint. It started with:

  topSuggestion(name){
    return new Promise((resolve, reject) => {
      resolve(getTopSuggestion(name));
    });
  }

which called:

  let getTopSuggestion = (query) => {
    return new Promise((resolve, reject) => {
      request(config.api.autocomplete + '?query=' + query, (err, response, body) => {
        resolve(JSON.parse(body)[0]);
      });
    });
  }

Now when everything is working as it should, this code is just fine. But on the weekend, the response from the endpoint in getTopSuggestion() was returning:

  Welcome to the Company API. For docs see https://clearbit.com/docs#company-api

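which, of course, isn't JSON, and so the JSON.parse() was throwing an exception. But that throw happened inside the request() callback, after the Promise executor in getTopSuggestion() had already returned, so nothing turned it into a rejection. It surfaced as an uncaught exception that no try/catch or .catch() in our code could reach. This was bad news.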
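One way the client could have kept the parse error catchable is sketched below. This is not the library's code: settleFromCallback() and getTopSuggestionSafe() are hypothetical names, and requestFn is assumed to call back with (err, response, body) the way request() does above. The idea is simply to wrap the parse in a try/catch inside the callback and hand any failure to reject().

```javascript
// Hypothetical helper (not part of the original client): settle a Promise
// from a request-style callback without letting a parse error escape.
function settleFromCallback(err, body, resolve, reject) {
  if (err) {
    reject(err); // transport-level failure
    return;
  }
  try {
    resolve(JSON.parse(body)[0]);
  } catch (parseErr) {
    reject(parseErr); // a non-JSON body becomes a rejection
  }
}

// requestFn is assumed to take (url, callback) and call back with
// (err, response, body), like the request() call in the original code.
function getTopSuggestionSafe(requestFn, url) {
  return new Promise((resolve, reject) => {
    requestFn(url, (err, response, body) => {
      settleFromCallback(err, body, resolve, reject);
    });
  });
}
```

With this shape, a caller that does `await getTopSuggestionSafe(...)` inside a try block, or chains a .catch(), sees the plain-text response as an ordinary rejection.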

Now as it turned out, a coworker found that Clearbit was doing some maintenance, and that might have been the genesis of the issue. It was made much worse because when several of us debugged this on our own machines, the issue didn't present itself. It showed up only in production.

Still, it was clear this wasn't the right code to use, and the library was 6 years old without an update, and the code was small. So a coworker suggested we just make the one call ourselves:

    let res = {}
    try {
      // Get the top URL Suggestion for a store name
      const url = new URL(config.api.autocomplete)
      url.searchParams.append('query', name)
      // ...now make the call, and parse the JSON payload
      const payload = await fetch(url).then(response => response.json())
      if (Array.isArray(payload) && payload.length > 0) {
        // ...pick off the top suggestion
        res = payload[0]
      }
    } catch (err) {
      log.error(`[logoFinder] ...message... Name: '${name}', Details: "${err.message}"`)
      return {
        success: false,
        error: errorMessages.badClearBitRequest,
        exception: err,
      }
    }
    return {
      success: true,
      ...res,
    }
  }

where the error message is really up to you. The point is that when the endpoint returns simple text, response.json() rejects, the await raises that as an exception inside the try block, and the catch handles it, without causing all the trouble of the library we were using.

There were a few things I liked about the new implementation we came up with:

  • Explicitly setting the query param on the URL - while it's possible that 90% of all name values would not lead to an issue, it's always nice to be safe and make sure that the proper encodings are done with the query params. It's two lines of code, but it makes sure that it's all handled properly.
  • The chaining of fetch() and then() - both fetch() and response.json() return Promises, so you might expect to see two awaits, but there's only one. This is a nice feature of then(): when its callback returns a Promise, the chain adopts it, so the single await yields the parsed JSON value and not a pending Promise.
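Both points can be seen in a small sketch. The endpoint URL, the company name, fakeFetch(), and topName() are all made up for illustration; fakeFetch() only stands in for fetch() so the example runs without a network call.

```javascript
// Hypothetical endpoint and name, used only for illustration.
const base = 'https://example.com/suggest';
const name = 'AT&T Wireless';

// Naive concatenation: the '&' in the name starts a new parameter.
const naive = new URL(base + '?query=' + name);

// searchParams.append() encodes the value so it survives intact.
const safe = new URL(base);
safe.searchParams.append('query', name);

console.log(naive.searchParams.get('query')); // AT
console.log(safe.searchParams.get('query'));  // AT&T Wireless

// Stand-in for fetch(): resolves to an object whose json() is also async.
const fakeFetch = (body) =>
  Promise.resolve({ json: async () => JSON.parse(body) });

// One await is enough: then() adopts the Promise returned by json().
async function topName(body) {
  const payload = await fakeFetch(body).then((response) => response.json());
  return payload[0].name;
}

topName('[{"name":"Flexbase"}]').then((n) => console.log(n)); // Flexbase
```

If the body were plain text, json() would reject, and the same single await would raise that as an exception a surrounding try/catch can handle.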

Sure, we still need to get the first element in the Array, but we also test that to make sure it's actually an array, and that there's something to get. It's just a lot more defensive coding than the original client had, and when we did this, we still got the good results on the dev machines, and at the same time, we got proper exception catching on the production instances.

Thankfully, the issues subsided about the time we got the fix coded, tested, and into production, so it wasn't long-lived. It was a problem for a while, but we were able to recover from the errors thanks to queues and retries, which is another saving grace that I was very thankful for.

Nothing like a little production outage to make the day exciting. 🙂