The issue is that Vercel's Image API is ridiculously expensive and not particularly efficient either.
I would recommend using Thumbor instead: https://thumbor.readthedocs.io/en/latest/. You could have ChatGPT write up a React image wrapper pretty quickly for this.
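The wrapper is mostly URL construction anyway. A rough sketch of what I mean, assuming a self-hosted Thumbor instance running in "unsafe" (unsigned) mode at a made-up host (a real deployment should use signed URLs instead):

    // ThumborImage.tsx -- illustrative only, not a drop-in replacement for next/image.
    import React from 'react';

    const THUMBOR_HOST = 'https://thumbor.example.com'; // hypothetical self-hosted instance

    type Props = { src: string; width: number; height: number; alt: string };

    export function ThumborImage({ src, width, height, alt }: Props) {
      // Thumbor encodes the transformation into the URL path:
      //   /unsafe/{width}x{height}/smart/{original image URL}
      // "smart" turns on Thumbor's smart cropping.
      const url = `${THUMBOR_HOST}/unsafe/${width}x${height}/smart/${encodeURIComponent(src)}`;
      return <img src={url} width={width} height={height} alt={alt} loading="lazy" />;
    }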
I once sat down to calculate what my app would cost if it ever went viral while hosted on Vercel. That put me off hosting anything on Vercel, or even touching Next.js. It feels like total vendor lock-in once you have something running there, and you kind of end up paying them 10x more than if you had taken the extra time to deploy it yourself.
$5 to resize 1,000 images is ridiculously expensive.
At my last job we resized a very large number of images every day, and did so for significantly less (a fraction of a cent per thousand images).
Am I missing something here?
As someone who maintains a Music+Podcast app as a hobby project, I intentionally have no servers for it.
You don't need one. You can fetch RSS feeds directly on mobile devices; it's faster, less work to maintain, and has a smaller attack surface for rogue bots.
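The entire "backend" boils down to one on-device function. A rough sketch in TypeScript, with fast-xml-parser standing in as one example of a small XML parser (not necessarily what my app actually uses):

    import { XMLParser } from 'fast-xml-parser';

    // Fetch a podcast RSS feed directly on the device and pull out the episodes.
    export async function loadEpisodes(feedUrl: string) {
      const res = await fetch(feedUrl);
      const xml = await res.text();
      const feed = new XMLParser({ ignoreAttributes: false }).parse(xml);
      const items = feed?.rss?.channel?.item ?? [];
      return (Array.isArray(items) ? items : [items]).map((item: any) => ({
        title: item.title,
        audioUrl: item.enclosure?.['@_url'], // the <enclosure> tag carries the media URL
        publishedAt: item.pubDate,
      }));
    }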
Death by stupid microservices. Even at 1.5 million pages and the traffic they're talking about, this could easily be hosted on a fixed $80/month Linode.
The cost of getting locked into Vercel.
Yeah, AI crawlers - add that to my list of phobias. Though for a bootstrapped startup, why not cut all recurring expenses and just deploy ImageMagick, which I'm sure will do the trick for less?
Wow, this is interesting. I launched my site about a week ago and only submitted it to Google, but all the crawlers mentioned in the article (especially the SEO bots) were heavily crawling it within a few days.
Interestingly, OpenAI's crawler visited over 1,000 times, many of those as "ChatGPT-User/1.0", which is supposedly used when a user searches from ChatGPT. Not a single referred visitor, though. Makes me wonder whether allowing bot crawls benefits content publishers at all.
I ended up banning every SEO bot, plus a bunch of other bots, in robots.txt.
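Roughly along these lines (the specific user agents below are just common SEO crawler tokens, not my exact list):

    User-agent: AhrefsBot
    Disallow: /

    User-agent: SemrushBot
    Disallow: /

    User-agent: MJ12bot
    Disallow: /

    User-agent: DotBot
    Disallow: /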
Vercel has a fairly generous free quota and a far-from-negligible pricing scheme once you're past it - I think people still remember https://service-markup.vercel.app/ .
For the crawl problem, I'll wait and see whether robots.txt proves enough to stop GenAI bots from crawling, since these GenAI companies are, of course, far too "well-behaved" to ignore robots.txt.
> Optimizing an image meant that Next.js downloaded the image from one of those hosts to Vercel first, optimized it, then served to the users.
So Metacast generates bot traffic on other websites, presumably to "borrow" their content and serve it to their own users, but they don't like it when others do the same to them.
Another story for https://serverlesshorrors.com/
It's crazy how these companies are really fleecing their customers who don't know any better. Is there even a way to tell Vercel: "I only want to spend $10 a month max on this project, CUT ME OFF if I go past it."? This is crazy.
I spend $12 a month on BunnyCDN, and $9 a month on Bunny's image optimizer, which lets me add HTTP params to the URL to modify images.
That covers 1.33 TB of CDN traffic. (PS: can't say enough good things about BunnyCDN, such a cool company; it does exactly what you pay for, nothing more, nothing less.)
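For reference, the optimizer works by appending query parameters to the image URL, so a request ends up looking something like https://example.b-cdn.net/covers/episode.jpg?width=320&height=320&quality=80 (hypothetical URL, and parameter names from memory; check Bunny's docs before copying this).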
This is nuts, dude.
A single $5 VPS should easily handle tens of thousands of requests...
Simple thumbnails don't add much on top of that. It's sad that the trend of "fullstack" engineers who are really just frontend JS/TS devs took off, leaving thousands of companies with no clue how to serve websites or handle backend and server engineering...
Don’t feed the bots. Why a pixel image? Use an SVG and make it pulse while playing.
Is there no CDN? This feels like it's a non-issue if there's a CDN.
I guess it goes to show how jaded I am, but as I was reading this, it felt like an ad for Vercel. I'm so sick of marketing content being submitted as actual content, that when I read a potentially actual blog/post-mortem, my spidey senses get all tingly about potential advertising. However, I feel like if I turn down the sensitivity knob, I'll be worse off than knee jerk thinking things like this are ads.
$5 for 1,000 image optimizations? Is Vercel not caching the optimized output? Why would it be doing more than one optimization per image on a fresh deploy?
"Step 3: robots.txt"
Will do nothing to mitigate the problem. As is well known, these bots don't respect it.
It’s a shame that the knee-jerk reaction has been to outright block these bots. I think in the future, websites will learn to serve pure markdown to these bots instead of blocking. That way, websites prevent bandwidth overages like in the article, while still informing LLMs about the services their website provides.
[disclaimer: I run https://pure.md, which helps websites shield from this traffic]
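Concretely, for a Next.js site like the one in the article, the rough shape of the idea is middleware that spots crawler user agents and rewrites the request to a markdown rendition of the page instead of returning a block page. A sketch (the bot regex and the /md route are placeholders I made up, and this isn't how pure.md itself is implemented):

    // middleware.ts -- sketch of "serve markdown to LLM bots instead of blocking them".
    import { NextRequest, NextResponse } from 'next/server';

    const LLM_BOTS = /GPTBot|ChatGPT-User|ClaudeBot|PerplexityBot|Bytespider/i; // illustrative list

    export function middleware(req: NextRequest) {
      const ua = req.headers.get('user-agent') ?? '';
      if (LLM_BOTS.test(ua)) {
        // Rewrite to a hypothetical route that renders the same page as plain markdown,
        // which is far cheaper to serve than the full HTML + image pipeline.
        const url = req.nextUrl.clone();
        url.pathname = `/md${url.pathname}`;
        return NextResponse.rewrite(url);
      }
      return NextResponse.next();
    }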
"Together they sent 66.5k requests to our site within a single day."
Only script kiddies run into problems at such low numbers. I'm sure security is your next "misconfiguration". Better to look for an offline job in the entertainment industry.
(I work at Vercel) While it's good that our spend limits worked, it clearly wasn't obvious how to block or challenge AI crawlers¹ with our firewall (which it seems you found manually). We'll surface this better in the UI, and we also have more bot protection features coming soon. Also glad our improved image optimization pricing² would have helped. Open to other feedback as well, thanks for sharing.
¹: https://vercel.com/templates/vercel-firewall/block-ai-bots-f...
²: https://vercel.com/changelog/faster-transformations-and-redu...