Fabrice Canel from Microsoft said that each day Bing discovers 12s of billions of normalized URLs never seen before. That is a lot of new URLs for BingBot to find in a single day, don’t you think?
But the web is big and content is constantly being produced: not just quality content, but also a lot of junk, gibberish, machine-generated content, and so forth.
Fabrice explained on Twitter that most of what Bing discovers is “mostly useless content.” He listed examples such as duplicate content, scraped content, automatically generated content, spam, junk, and more.
So while Bing may discover billions and billions of new URLs per day, I doubt it indexes many of them.
Here are those tweets:
Site of the internet = ♾. We discover at #bing daily 12s of billions of normalized URLs never seen before. Mostly useless content (duplicate/scraped/automatically generated content, spam, junk, etc.). See our guidelines https://t.co/IKdDkLNs6W including the “Things to avoid”
— Fabrice Canel (@facan) August 17, 2022
Forum discussion at Twitter.