iMessage User Agent firing crawler twice

I've been building an Express server that separates crawlers from normal users so it can serve crawlers very specific meta information. I have a very large crawler list (JSON) that I pattern-match against to determine whether the user agent is a bot.

After testing, I've got nearly all platforms set up, but iMessage is throwing me off. When I paste a URL into iMessage and press send, I see two different user agents appear in my logs:

com.apple.WebKit.Networking/8615.3.12.10.2 CFNetwork/1410.0.3 Darwin/22.6.0

Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_1) AppleWebKit/601.2.4 (KHTML, like Gecko) Version/9.0.1 Safari/601.2.4 facebookexternalhit/1.1 Facebot Twitterbot/1.0

How can I add this user agent to my master list while ensuring I don't serve crawler info to normal users and vice versa?

Here's how I'm detecting crawlers:

const isCrawler = (userAgent) => {
    return crawlers.some(crawler => new RegExp(crawler, 'i').test(userAgent));
};

And here's an example of one of the many items in the list:

{
  "pattern": "Applebot",
  "url": "http://www.apple.com/go/applebot",
  "addition_date": "2015/04/15",
  "instances": [
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/600.2.5 (KHTML, like Gecko) Version/8.0.2 Safari/600.2.5 (Applebot/0.1)",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/600.2.5 (KHTML, like Gecko) Version/8.0.2 Safari/600.2.5 (Applebot/0.1; +http://www.apple.com/go/applebot)",
    "Mozilla/5.0 (compatible; Applebot/0.3; +http://www.apple.com/go/applebot)",
    "Mozilla/5.0 (iPhone; CPU iPhone OS 6_0 like Mac OS X) AppleWebKit/536.26 (KHTML, like Gecko) Version/6.0 Mobile/10A5376e Safari/8536.25 (compatible; Applebot/0.3; +http://www.apple.com/go/applebot)",
    "Mozilla/5.0 (iPhone; CPU iPhone OS 8_1 like Mac OS X) AppleWebKit/600.1.4 (KHTML, like Gecko) Version/8.0 Mobile/12B410 Safari/600.1.4 (Applebot/0.1; +http://www.apple.com/go/applebot)"
  ]
},
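
For reference, here is a minimal sketch of how the detection function and the list items might fit together. It assumes the list holds objects shaped like the Applebot item, in which case the regex has to be built from each entry's pattern field: passing the whole object to RegExp would coerce it to the string [object Object], which is a character class and matches almost any user agent. The three sample entries and the compiledPatterns name are illustrative only, not the real list.

```javascript
// Illustrative subset of the crawler list; the real list is much larger.
const crawlers = [
  { pattern: "Applebot" },
  { pattern: "facebookexternalhit" },
  { pattern: "Twitterbot" },
];

// Compile each pattern once at startup instead of on every request.
const compiledPatterns = crawlers.map(({ pattern }) => new RegExp(pattern, "i"));

// True when any known crawler pattern matches the user agent.
// A missing User-Agent header is treated as "not a crawler".
const isCrawler = (userAgent) =>
  typeof userAgent === "string" &&
  compiledPatterns.some((re) => re.test(userAgent));
```

With these sample patterns, the second iMessage user agent above (the one ending in facebookexternalhit/1.1 Facebot Twitterbot/1.0) would be classified as a crawler, while the com.apple.WebKit.Networking one would not.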

TL;DR: When I paste a link into iMessage, I see a few requests to my Express server with two different user agents. Neither triggers the correct meta information to be served.

Edit: It's probably working when the first request comes in, and the second user agent is probably the issue. I guess my question is: could I also look for something like com.apple.WebKit.Networking/8615.3.12.10.2 CFNetwork/1410.0.3 Darwin/22.6.0 and, if it's found, assume the request comes from a crawler and not a human?
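
A rough sketch of what that extra check could look like, assuming the version numbers change between OS releases and therefore should not be pinned. The appleNetworkingEntry and isAppleNetworkingAgent names are hypothetical. Whether this agent is only ever sent by link previews, and never by a person's browser or an in-app web view, is an open assumption here and not something confirmed.

```javascript
// Hypothetical extra list entry: matches the CFNetwork-style agent
// from the logs without pinning any of the version numbers.
const appleNetworkingEntry = {
  pattern:
    "^com\\.apple\\.WebKit\\.Networking/[\\d.]+ CFNetwork/[\\d.]+ Darwin/[\\d.]+$",
};

const appleNetworkingRe = new RegExp(appleNetworkingEntry.pattern, "i");

// True when the user agent is the bare WebKit networking agent.
// ASSUMPTION: treating this agent as a crawler is unverified.
const isAppleNetworkingAgent = (userAgent) =>
  typeof userAgent === "string" && appleNetworkingRe.test(userAgent);
```

Anchoring the pattern with ^ and $ keeps it from matching ordinary Safari user agents, which contain AppleWebKit but do not start with com.apple.WebKit.Networking.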
