In the midst of the hottest summer on record, Cloudflare held its first ever Impact Week. We announced a variety of products and initiatives that aim to make the Internet and our planet a better place, with a focus on environmental, social, and governance projects. Today, we’re excited to share an update on Crawler Hints, an initiative announced during Impact Week. Crawler Hints is a service that improves the operating efficiency of the approximately 45% of Internet traffic that comes from web crawlers and bots.
Crawler Hints achieves this efficiency improvement by ensuring that crawlers get information about what they’ve crawled previously and whether it makes sense to crawl a website again.
Today we are excited to announce two updates for Crawler Hints:
The first: Crawler Hints now supports IndexNow, a new protocol that allows websites to notify search engines whenever content on their website is created, updated, or deleted. By collaborating with Microsoft and Yandex, Cloudflare can help improve the efficiency of their search engine infrastructure, customer origin servers, and the Internet at large.
The second: Crawler Hints is now generally available to all Cloudflare customers for free. Customers can benefit from these more efficient crawls with a single button click. If you want to enable Crawler Hints, you can do so in the Cache Tab of the Dashboard.

What problem does Crawler Hints solve?

Crawlers help make the Internet work. Crawlers are automated services that travel the Internet looking for… well, whatever they are programmed to look for. To power experiences that rely on indexing content from across the web, search engines and similar services operate massive networks of bots that crawl the Internet to identify the content most relevant to a user query. But because content on the web is always changing, and there is no central clearinghouse for when these changes happen on websites, search engine crawlers have a Sisyphean task. They must continuously wander the Internet, making guesses on how frequently they should check a given site for updates to its content.
Companies that run search engines have worked hard to make the process as efficient as possible, pushing the state-of-the-art for crawl cadence and infrastructure efficiency. But there remains one clear area of waste: excessive crawl.
At Cloudflare, we see traffic from all the major search crawlers, and have spent the last year studying how often these bots revisit a page that hasn't changed since they last saw it. Every one of these visits is a waste. And, unfortunately, our observation suggests that 53% of this crawler traffic is wasted.
With Crawler Hints, we expect to make this task a bit more tractable by providing an additional heuristic to the people who run these crawlers. This will allow them to know when content has been changed or added to a site instead of relying on preferences or previous changes that might not reflect the true change cadence for a site. Crawler Hints aims to increase the proportion of relevant crawls and limit crawls that don’t find fresh content, improving customer experience and reducing the need for repeated crawls.
Cloudflare sits in a unique position on the Internet to help give crawlers hints about when they should recrawl a site. Don’t knock on a website’s door every 30 seconds to see if anything is new when Cloudflare can proactively tell your crawler when it’s the best time to index new or changed content. That’s Crawler Hints in a nutshell!
If you want to learn more about Crawler Hints, see the original blog.

What is IndexNow?

IndexNow is a standard written by the teams behind the Microsoft and Yandex search engines. It aims to give websites an efficient way to signal to search engines and other crawlers when they should crawl content. Cloudflare’s Crawler Hints now supports IndexNow.
“In its simplest form, IndexNow is a simple ping so that search engines know that a URL and its content has been added, updated, or deleted, allowing search engines to quickly reflect this change in their search results.” - www.indexnow.org
By enabling Crawler Hints on your website, with the simple click of a button, Cloudflare will take care of signaling to these search engines when your content has changed via the IndexNow protocol. You don’t need to do anything else!
What does this mean for search engine operators? With Crawler Hints you’ll receive a near real-time, pushed feed of change events of Cloudflare websites (that have opted in). This, in turn, will dramatically improve not just the quality of your results, but also the energy efficiency of running your bots.
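With Crawler Hints enabled, Cloudflare sends these pings for you, so nothing below is required. For readers curious what "a simple ping" looks like, here is a minimal sketch of building an IndexNow GET request. The endpoint is the shared one we understand the IndexNow documentation to describe (participating search engines also expose their own `/indexnow` endpoints), and the page URL and key are hypothetical placeholders. The protocol also expects the key to be hosted as a text file on the site, which this sketch does not cover.

```python
from urllib.parse import urlencode

# Assumption: the shared endpoint described in the IndexNow documentation.
INDEXNOW_ENDPOINT = "https://api.indexnow.org/indexnow"


def build_indexnow_ping(changed_url: str, key: str) -> str:
    """Return the GET URL announcing that `changed_url` was added, updated, or deleted."""
    query = urlencode({"url": changed_url, "key": key})
    return f"{INDEXNOW_ENDPOINT}?{query}"


if __name__ == "__main__":
    # Hypothetical page and key, for illustration only.
    print(build_indexnow_ping("https://example.com/blog/new-post", "abc123"))
```

The function only builds the URL; actually sending it is a plain HTTP GET with any client.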

Collaborating with industry leaders

Because a sizable portion of the Internet is proxied behind Cloudflare, we are in a unique position to see trends in the way bots access web resources. That visibility allows us to be proactive about signaling which crawls are required and which are not. We are excited to work with partners to make these insights useful to our customers. Search engines are key constituents in this equation. We are happy to collaborate and share this vision of a more efficient Internet with Microsoft Bing and Yandex. We have been testing our interaction via IndexNow with Bing and Yandex for months with some early successes.
This is just the beginning. Crawler Hints is a continuous process that will require working with more and more partners to improve Internet efficiency more generally. While this may take time and participation from other key parts of the industry, we are open to collaborating with any interested participant who relies on crawling to power user experiences.
“The cache data from CDNs is a really valuable signal for content freshness. Cloudflare, as one of the top CDNs, is key in the adoption of IndexNow to become an industry-wide standard with a large portion of the internet actually using it. Cloudflare has built a really easy 1-click button for their users to start using it right away. Cloudflare’s mission of helping build a better Internet resonates well with why I started IndexNow i.e. to build a more efficient and effective Search.”
- Fabrice Canel, Principal Program Manager

“Yandex is excited to join IndexNow as part of our long-term focus on sustainability. We have been working with the Cloudflare team in early testing to incorporate their caching signals in our crawling mechanism via the IndexNow API. The results are great so far.”
- Maxim Zagrebin, Head of Yandex Search

"DuckDuckGo is supportive of anything that makes search more environmentally friendly and better for end users without harming privacy. We're looking forward to working with Cloudflare on this proposal."
- Gabriel Weinberg, CEO and Founder

How do Cloudflare customers benefit?

Crawler Hints doesn’t just benefit search engines. For our customers and origin owners, Crawler Hints will ensure that search engines and other bot-powered experiences will always have the freshest version of your content, translating into happier users and ultimately influencing search rankings. Crawler Hints will also mean less traffic hitting your origin, reducing resource consumption. Moreover, your site performance will be improved as well: your human customers will not be competing with bots!
And for Internet users? Bot-fed experiences such as search engines and pricing tools, which we all use every day whether we realize it or not, will now deliver more useful results from crawled data, because Cloudflare has signaled to the owners of the bots the moment they need to update their results.

How can I enable Crawler Hints for my website?

Crawler Hints is free to use for all Cloudflare customers and promises to revolutionize web efficiency. If you’d like to see how Crawler Hints can benefit how your website is indexed by the world’s biggest search engines, please feel free to opt into the service:
1. Sign in to your Cloudflare Account.
2. In the dashboard, navigate to the Cache tab.
3. Click on the Configuration section.
4. Locate the Crawler Hints sign-up card and enable it. It's that easy.
Once you’ve enabled it, we will begin sending hints to search engines about when they should crawl particular parts of your website. Crawler Hints holds tremendous promise to improve the efficiency of the Internet.

What’s next?

We’re thrilled to collaborate with industry leaders Microsoft Bing and Yandex to bring IndexNow to Crawler Hints, and to bring Crawler Hints to a wide audience in general availability. We look forward to working with additional companies who run crawlers to help make this process more efficient for the whole Internet.

Published 2021-10-18 by Alex Krivit and Abhi Das. Tags: Crawler Hints, Product News, Speed & Reliability, SEO. Source: https://blog.cloudflare.com/cloudflare-now-supports-indexnow

R2 Data Catalog: Managed Apache Iceberg tables with zero egress fees

Apache Iceberg is quickly becoming the standard table format for querying large analytic datasets in object storage. We’re seeing this trend firsthand as more and more developers and data teams adopt Iceberg on Cloudflare R2. But until now, using Iceberg with R2 meant managing additional infrastructure or relying on external data catalogs.
So we’re fixing this. Today, we’re launching the R2 Data Catalog in open beta, a managed Apache Iceberg catalog built directly into your Cloudflare R2 bucket.
If you’re not already familiar with it, Iceberg is an open table format built for large-scale analytics on datasets stored in object storage. With R2 Data Catalog, you get the database-like capabilities Iceberg is known for – ACID transactions, schema evolution, and efficient querying – without the overhead of managing your own external catalog.
R2 Data Catalog exposes a standard Iceberg REST catalog interface, so you can connect the engines you already use, like PyIceberg, Snowflake, and Spark. And, as always with R2, there are no egress fees, meaning that no matter which cloud or region your data is consumed from, you won’t have to worry about growing data transfer costs.
Ready to query data in R2 right now? Jump into the developer docs and enable a data catalog on your R2 bucket in just a few clicks. Or keep reading to learn more about Iceberg, data catalogs, how metadata files work under the hood, and how to create your first Iceberg table.

What is Apache Iceberg?

Apache Iceberg is an open table format for analyzing large datasets in object storage. It brings database-like features – ACID transactions, time travel, and schema evolution – to files stored in formats like Parquet or ORC.
Historically, data lakes were just collections of raw files in object storage. However, without a unified metadata layer, datasets could easily become corrupted, were difficult to evolve, and queries often required expensive full-table scans.
Iceberg solves these problems by:
Providing ACID transactions for reliable, concurrent reads and writes.
Maintaining optimized metadata, so engines can skip irrelevant files and avoid unnecessary full-table scans.
Supporting schema evolution, allowing columns to be added, renamed, or dropped without rewriting existing data.
Iceberg is already widely supported by engines like Apache Spark, Trino, Snowflake, DuckDB, and ClickHouse, with a fast-growing community behind it.

How Iceberg tables are stored

Internally, an Iceberg table is a collection of data files (typically stored in columnar formats like Parquet or ORC) and metadata files (typically stored in JSON or Avro) that describe table snapshots, schemas, and partition layouts.
To understand how query engines interact efficiently with Iceberg tables, it helps to look at an Iceberg metadata file (simplified):

    {
      "format-version": 2,
      "table-uuid": "0195e49b-8f7c-7933-8b43-d2902c72720a",
      "location": "s3://my-bucket/warehouse/0195e49b-79ca/table",
      "current-schema-id": 0,
      "schemas": [
        {
          "schema-id": 0,
          "type": "struct",
          "fields": [
            { "id": 1, "name": "id", "required": false, "type": "long" },
            { "id": 2, "name": "data", "required": false, "type": "string" }
          ]
        }
      ],
      "current-snapshot-id": 3567362634015106507,
      "snapshots": [
        {
          "snapshot-id": 3567362634015106507,
          "sequence-number": 1,
          "timestamp-ms": 1743297158403,
          "manifest-list": "s3://my-bucket/warehouse/0195e49b-79ca/table/metadata/snap-3567362634015106507-0.avro",
          "summary": {},
          "schema-id": 0
        }
      ],
      "partition-specs": [{ "spec-id": 0, "fields": [] }]
    }

A few of the important components are:
schemas: Iceberg tracks schema changes over time. Engines use schema information to safely read and write data without needing to rewrite underlying files.
snapshots: Each snapshot references a specific set of data files that represent the state of the table at a point in time. This enables features like time travel.
partition-specs: These define how the table is logically partitioned. Query engines leverage this information during planning to skip unnecessary partitions, greatly improving query performance.
By reading Iceberg metadata, query engines can efficiently prune partitions, load only the relevant snapshots, and fetch only the data files they need, resulting in faster queries.
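To make the role of these fields concrete, here is a small sketch of the first lookups an engine might perform on the simplified metadata file shown above: resolving the current snapshot to its manifest list, and the current schema to its column names. The helper names are ours, not part of any Iceberg library, and a real engine would go on to read the Avro manifest list, which this sketch does not attempt.

```python
import json

# Trimmed copy of the simplified metadata file shown above.
METADATA_JSON = """
{
  "format-version": 2,
  "current-schema-id": 0,
  "schemas": [
    {"schema-id": 0, "type": "struct", "fields": [
      {"id": 1, "name": "id", "required": false, "type": "long"},
      {"id": 2, "name": "data", "required": false, "type": "string"}
    ]}
  ],
  "current-snapshot-id": 3567362634015106507,
  "snapshots": [
    {
      "snapshot-id": 3567362634015106507,
      "sequence-number": 1,
      "timestamp-ms": 1743297158403,
      "manifest-list": "s3://my-bucket/warehouse/0195e49b-79ca/table/metadata/snap-3567362634015106507-0.avro",
      "schema-id": 0
    }
  ],
  "partition-specs": [{"spec-id": 0, "fields": []}]
}
"""


def current_snapshot(metadata: dict) -> dict:
    """Return the snapshot entry that `current-snapshot-id` points at."""
    wanted = metadata["current-snapshot-id"]
    for snapshot in metadata["snapshots"]:
        if snapshot["snapshot-id"] == wanted:
            return snapshot
    raise LookupError(f"snapshot {wanted} not found")


def current_columns(metadata: dict) -> list[str]:
    """Return the column names of the schema named by `current-schema-id`."""
    wanted = metadata["current-schema-id"]
    schema = next(s for s in metadata["schemas"] if s["schema-id"] == wanted)
    return [field["name"] for field in schema["fields"]]


metadata = json.loads(METADATA_JSON)
print(current_snapshot(metadata)["manifest-list"])
print(current_columns(metadata))
```

Time travel follows the same pattern: pick a different entry from `snapshots` instead of the current one.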

Why do you need a data catalog?

Although the Iceberg data and metadata files themselves live directly in object storage (like R2), the list of tables and pointers to the current metadata need to be tracked centrally by a data catalog.
Think of a data catalog as a library's index system. While books (your data) are physically distributed across shelves (object storage), the index provides a single source of truth about what books exist, their locations, and their latest editions. Without this index, readers (query engines) would waste time searching for books, might access outdated versions, or could accidentally shelve new books in ways that make them unfindable.
Similarly, data catalogs ensure consistent, coordinated access, allowing multiple query engines to safely read from and write to the same tables without conflicts or data corruption.
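One way to picture how a catalog prevents conflicts is as an atomic "swap the pointer only if nobody else has moved it" operation. The toy below is an in-memory illustration of that idea under our own assumptions; `ToyCatalog`, `CommitConflict`, and the file names are hypothetical, and this is not how R2 Data Catalog is implemented.

```python
import threading


class CommitConflict(Exception):
    """Raised when another writer updated the table first."""


class ToyCatalog:
    """In-memory stand-in for a data catalog (illustration only)."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._pointers: dict[str, str] = {}  # table name -> current metadata file

    def create_table(self, name: str, metadata_location: str) -> None:
        with self._lock:
            if name in self._pointers:
                raise ValueError(f"table {name!r} already exists")
            self._pointers[name] = metadata_location

    def load_table(self, name: str) -> str:
        with self._lock:
            return self._pointers[name]

    def commit(self, name: str, expected: str, new: str) -> None:
        """Move the pointer to `new` only if it still equals `expected`."""
        with self._lock:
            if self._pointers[name] != expected:
                raise CommitConflict(f"{name!r} changed since it was read")
            self._pointers[name] = new


catalog = ToyCatalog()
catalog.create_table("default.my_table", "metadata/v1.json")

# Two writers read the same starting point.
base = catalog.load_table("default.my_table")
catalog.commit("default.my_table", expected=base, new="metadata/v2.json")  # writer A succeeds
try:
    catalog.commit("default.my_table", expected=base, new="metadata/v2-b.json")  # writer B is stale
except CommitConflict as err:
    print("retry needed:", err)
```

The losing writer re-reads the table, reapplies its change on top of the new metadata, and tries again, so neither update is silently lost.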

Create your first Iceberg table on R2

Ready to try it out? Here’s a quick example using PyIceberg and Python to get you started. For a detailed step-by-step guide, check out our developer docs.
1. Enable R2 Data Catalog on your bucket:

    npx wrangler r2 bucket catalog enable my-bucket

Or use the Cloudflare dashboard: Navigate to R2 Object Storage > Settings > R2 Data Catalog and click Enable.
2. Create a Cloudflare API token with permissions for both R2 storage and the data catalog.
3. Install PyIceberg and PyArrow, then open a Python shell or notebook:

    pip install pyiceberg pyarrow

4. Connect to the catalog and create a table:

    import pyarrow as pa
    from pyiceberg.catalog.rest import RestCatalog

    # Define catalog connection details (replace variables)
    WAREHOUSE = "<WAREHOUSE>"
    TOKEN = "<TOKEN>"
    CATALOG_URI = "<CATALOG_URI>"

    # Connect to R2 Data Catalog
    catalog = RestCatalog(
        name="my_catalog",
        warehouse=WAREHOUSE,
        uri=CATALOG_URI,
        token=TOKEN,
    )

    # Create default namespace
    catalog.create_namespace("default")

    # Create simple PyArrow table
    df = pa.table({
        "id": [1, 2, 3],
        "name": ["Alice", "Bob", "Charlie"],
    })

    # Create an Iceberg table
    table = catalog.create_table(
        ("default", "my_table"),
        schema=df.schema,
    )

You can now append more data or run queries, just as you would with any Apache Iceberg table.

While R2 Data Catalog is in open beta, there will be no additional charges beyond standard R2 storage and operations costs incurred by query engines accessing data. Storage pricing for buckets with R2 Data Catalog enabled remains the same as standard R2 buckets: $0.015 per GB-month. As always, egress directly from R2 buckets remains $0.
In the future, we plan to introduce pricing for catalog operations (e.g., creating tables, retrieving table metadata, etc.) and data compaction.
Below is our current thinking on future pricing. We’ll communicate more details around timing well before billing begins, so you can confidently plan your workloads.

Item | Pricing |
R2 storage (standard storage class) | $0.015 per GB-month (no change) |
R2 Class A operations | $4.50 per million operations (no change) |
R2 Class B operations | $0.36 per million operations (no change) |
Data Catalog operations (e.g., create table, get table metadata, update table properties) | $9.00 per million catalog operations |
Data Catalog compaction data processed | $0.05 per GB processed; $4.00 per million objects processed |
Data egress | $0 (no change, always free) |

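To get a feel for how these line items would combine, here is a rough estimator. It assumes the proposed prices in the table above take effect unchanged and ignores any free allowances or minimums, which the table does not describe. The usage figures are made up for illustration.

```python
from dataclasses import dataclass

# Proposed prices from the table above, in USD. They may change before billing begins.
STORAGE_PER_GB_MONTH = 0.015
CLASS_A_PER_MILLION = 4.50
CLASS_B_PER_MILLION = 0.36
CATALOG_OPS_PER_MILLION = 9.00
COMPACTION_PER_GB = 0.05
COMPACTION_PER_MILLION_OBJECTS = 4.00
MILLION = 1_000_000


@dataclass
class MonthlyUsage:
    storage_gb: float
    class_a_ops: int
    class_b_ops: int
    catalog_ops: int
    compacted_gb: float
    compacted_objects: int


def estimate_monthly_cost(usage: MonthlyUsage) -> float:
    """Sum each line item; egress is always $0, so it does not appear."""
    return (
        usage.storage_gb * STORAGE_PER_GB_MONTH
        + usage.class_a_ops / MILLION * CLASS_A_PER_MILLION
        + usage.class_b_ops / MILLION * CLASS_B_PER_MILLION
        + usage.catalog_ops / MILLION * CATALOG_OPS_PER_MILLION
        + usage.compacted_gb * COMPACTION_PER_GB
        + usage.compacted_objects / MILLION * COMPACTION_PER_MILLION_OBJECTS
    )


# Hypothetical workload: 1 TB stored, modest query and catalog traffic.
example = MonthlyUsage(
    storage_gb=1000,
    class_a_ops=2_000_000,
    class_b_ops=10_000_000,
    catalog_ops=1_000_000,
    compacted_gb=200,
    compacted_objects=500_000,
)
print(f"${estimate_monthly_cost(example):.2f}")
```

For this example the line items are $15.00 storage, $9.00 Class A, $3.60 Class B, $9.00 catalog operations, and $12.00 compaction, for a total of $48.60.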
What’s next?

We’re excited to see how you use R2 Data Catalog! If you’ve never worked with Iceberg (or even analytics data) before, we think this is the easiest way to get started.
Next on our roadmap is tackling compaction and table optimization. Query engines typically perform better when dealing with fewer but larger data files. We will automatically rewrite collections of small data files into larger files to deliver even faster query performance.
We’re also collaborating with the broader Apache Iceberg community to expand query-engine compatibility with the Iceberg REST Catalog spec.
We’d love your feedback. Join the Cloudflare Developer Discord to ask questions and share your thoughts during the public beta. For more details, examples, and guides, visit our developer documentation.

Published 2025-04-10 by Phillip Jones, Garvit Gupta, Alex Graham, and Garrett Gu. Tags: Developer Week, R2, Data Catalog, Storage, Developer Platform, Product News. Source: https://blog.cloudflare.com/r2-data-catalog-public-beta

“You get Instant Purge, and you get Instant Purge!” All purge methods now available to all customers

There's a tradition at Cloudflare of launching real products on April 1, instead of the usual joke product announcements circulating online today. In previous years, we've introduced impactful products like 1.1.1.1 and 1.1.1.1 for Families. Today, we're excited to continue this tradition by making every purge method available to all customers, regardless of plan type.
During Birthday Week 2024, we announced our intention to bring the full suite of purge methods — including purge by URL, purge by hostname, purge by tag, purge by prefix, and purge everything — to all Cloudflare plans. Historically, methods other than "purge by URL" and "purge everything" were exclusive to Enterprise customers. However, we've been openly rebuilding our purge pipeline over the past few years (hopefully you’ve read some of our blog series), and we're thrilled to share the results more broadly. We've spent recent months ensuring the new Instant Purge pipeline performs consistently under 150 ms, even during increased load scenarios, making it ready for every customer.
But that's not all — we're also significantly raising the default purge rate limits for Enterprise customers, allowing even greater purge throughput thanks to the efficiency of our newly developed Instant Purge system.

Building a better purge: a two-year journey

Stepping back, today's announcement represents roughly two years of focused engineering. Near the end of 2022, our team went heads down rebuilding Cloudflare’s purge pipeline with a clear yet challenging goal: dramatically increase our throughput while maintaining near-instant invalidation across our global network.
Cloudflare operates data centers in over 335 cities worldwide. Popular cached assets can reside across all of our data centers, meaning each purge request must quickly propagate to every location caching that content. Upon receiving a purge command, each data center must efficiently locate and invalidate cached content, preventing stale responses from being served. The amount of content that must be invalidated can vary drastically, from a single file, to all cached assets associated with a particular hostname. After the content has been purged, any subsequent requests will trigger retrieval of a fresh copy from the origin server, which will be stored in Cloudflare’s cache during the response.
Ensuring consistent, rapid propagation of purge requests across a vast network introduces substantial technical challenges, especially when accounting for occasional data center outages, maintenance, or network interruptions. Maintaining consistency under these conditions requires robust distributed systems engineering.

How did we scale purge?

We've previously discussed how our new Instant Purge system was architected to achieve sub-150 ms purge times. It’s worth noting that the performance improvements were only part of what our new architecture achieved, as it also helped us solve significant scaling challenges around storage and throughput that allowed us to bring Instant Purge to all users.
Initially, our purge system scaled well, but with rapid customer growth, the storage consumption from millions of daily purge keys that needed to be stored reduced available caching space. Early attempts to manage this storage and throughput demand involved queues and batching for smoothing traffic spikes, but this introduced latency and underscored the tight coupling between increased usage and rising storage costs.
We needed to revisit our thinking on how to better store purge keys and when to remove purged content so we could reclaim space. Historically, when a customer would purge by tag, prefix or hostname, Cloudflare would mark the content as expired and allow it to be evicted later. This is known as lazy-purge because nothing is actively removed from disk. Lazy-purge is fast, but not necessarily efficient, because it consumes storage for expired but not-yet-evicted content. After examining global or data center-level indexing for purge keys, we decided that wasn't viable due to increases in system complexity and the latency those indices could bring due to our network size. So instead, we opted for per-machine indexing, integrating indices directly alongside our cache proxies. This minimized network complexity, simplified reliability, and provided predictable scaling.
After careful analysis and benchmarking, we selected RocksDB, an embedded key-value store that we could optimize for our needs, which formed the basis of CacheDB, our Rust-based service running alongside each cache proxy. CacheDB manages indexing and immediate purge execution (active purge), significantly reducing storage needs and freeing space for caching.
Local queues within CacheDB buffer purge operations to ensure consistent throughput without latency spikes, while the cache proxies consult CacheDB to guarantee rapid, active purges. Our updated distribution pipeline broadcasts purges directly to CacheDB instances across machines, dramatically improving throughput and purge speed.
Using CacheDB, we've reduced storage requirements 10x by eliminating lazy purge storage accumulation, instantly freeing valuable disk space. The freed storage enhances cache retention, boosting cache HIT ratios and minimizing origin egress. These savings in storage and increased throughput allowed us to scale to the point where we can offer Instant Purge to more customers.
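The storage difference between lazy and active purge can be illustrated with a toy model. This is only a sketch of the two ideas as described above, written in Python with in-memory dictionaries; it is not CacheDB (which is a Rust service built on RocksDB), and every class and method name here is hypothetical.

```python
import itertools


class LazyPurgeCache:
    """Lazy purge: record the purge key and treat older entries as expired on read."""

    def __init__(self) -> None:
        self._clock = itertools.count(1)
        self.entries = {}     # url -> (stored_at, tags, body)
        self.purge_keys = {}  # tag -> logical time of the most recent purge

    def put(self, url, tags, body) -> None:
        self.entries[url] = (next(self._clock), frozenset(tags), body)

    def purge_tag(self, tag) -> None:
        self.purge_keys[tag] = next(self._clock)  # nothing is removed from storage

    def get(self, url):
        entry = self.entries.get(url)
        if entry is None:
            return None
        stored_at, tags, body = entry
        if any(self.purge_keys.get(tag, 0) > stored_at for tag in tags):
            return None  # expired, but still occupying space until evicted
        return body


class ActivePurgeCache:
    """Active purge: a local tag index lets a purge delete matching entries at once."""

    def __init__(self) -> None:
        self.entries = {}    # url -> (tags, body)
        self.tag_index = {}  # tag -> set of urls (the per-machine index)

    def put(self, url, tags, body) -> None:
        self._remove(url)
        self.entries[url] = (frozenset(tags), body)
        for tag in tags:
            self.tag_index.setdefault(tag, set()).add(url)

    def _remove(self, url) -> None:
        entry = self.entries.pop(url, None)
        if entry is None:
            return
        for tag in entry[0]:
            urls = self.tag_index.get(tag)
            if urls is not None:
                urls.discard(url)
                if not urls:
                    del self.tag_index[tag]

    def purge_tag(self, tag) -> None:
        for url in list(self.tag_index.get(tag, ())):
            self._remove(url)

    def get(self, url):
        entry = self.entries.get(url)
        return None if entry is None else entry[1]


lazy, active = LazyPurgeCache(), ActivePurgeCache()
for cache in (lazy, active):
    cache.put("/a.css", {"static"}, b"css")
    cache.put("/b.js", {"static"}, b"js")
    cache.put("/index.html", {"pages"}, b"html")
    cache.purge_tag("static")

# Both caches stop serving the purged assets, but only one has reclaimed the space.
print(len(lazy.entries), len(active.entries))
```

In the toy, the lazy cache still holds all three entries plus a purge key it must remember, while the active cache holds only the one live entry.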
For more information on how we designed the new Instant Purge system, please see the previous installment of our Purge series blog posts.

Striking the right balance: what to purge and when

Moving on to practical considerations of using these new purge methods, it’s important to use the right method for what you want to invalidate. Purging too aggressively can overwhelm origin servers with unnecessary requests, driving up egress costs and potentially causing downtime. Conversely, insufficient purging leaves visitors with outdated content. Balancing precision and speed is vital.
Cloudflare supports multiple targeted purge methods to help customers achieve this balance.
Starting today, all of these methods are available to every Cloudflare customer.

How to purge

Users can select their purge method directly in the Cloudflare dashboard, under the Cache tab in the Configuration section, or via the Cloudflare API. Each purge request should clearly specify the targeted URLs, hostnames, prefixes, or cache tags relevant to the selected purge type (known as purge keys). For instance, a prefix purge request might specify a directory such as example.com/foo/bar. To maximize efficiency and throughput, batching multiple purge keys in a single request is recommended over sending individual purge requests each with a single key.
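As a sketch of the batching advice, the helper below groups purge keys into request bodies of at most 100 keys, the per-request maximum given in this post's rate limit table for tag, prefix, and hostname purges. The body field names (`tags`, `hosts`, `prefixes`) and the endpoint path reflect our understanding of Cloudflare's purge API; treat them as assumptions and confirm against the API documentation. The zone ID and keys are placeholders, and the sketch only builds the payloads rather than sending them.

```python
from typing import Iterator

MAX_KEYS_PER_REQUEST = 100  # per-request maximum from the rate limit table

# Assumed request-body field for each purge type.
FIELD_FOR_PURGE_TYPE = {
    "tag": "tags",
    "hostname": "hosts",
    "prefix": "prefixes",
}


def purge_endpoint(zone_id: str) -> str:
    """URL that each batch would be POSTed to (with an API token in the headers)."""
    return f"https://api.cloudflare.com/client/v4/zones/{zone_id}/purge_cache"


def batch_purge_bodies(
    purge_type: str,
    keys: list[str],
    batch_size: int = MAX_KEYS_PER_REQUEST,
) -> Iterator[dict]:
    """Yield one JSON-serializable body per request, each with up to `batch_size` keys."""
    field = FIELD_FOR_PURGE_TYPE[purge_type]
    for start in range(0, len(keys), batch_size):
        yield {field: keys[start:start + batch_size]}


# 250 hypothetical prefixes become 3 requests instead of 250.
prefixes = [f"example.com/foo/{i}" for i in range(250)]
bodies = list(batch_purge_bodies("prefix", prefixes))
print(purge_endpoint("<ZONE_ID>"), [len(body["prefixes"]) for body in bodies])
```

Fewer, fuller requests matter because the rate limits described in the next section count requests, not keys.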

How much can you purge?

The new rate limits for Cloudflare's purge by tag, prefix, hostname, and purge everything are different for each plan type. We use a token bucket rate limit system, so each account has a token bucket with a maximum size based on plan type. When we receive a purge request, we first add tokens to the account’s bucket: the time passed since the account’s last purge request multiplied by the refill rate for its plan type (the result can be a fraction of a token), up to the bucket’s maximum size. Then we check if there’s at least one whole token in the bucket, and if so we remove it and process the purge request. If not, the purge request will be rate limited. An easy way to think about this rate limit is that the refill rate represents the consistent rate of requests a user can send in a given period, while the bucket size represents the maximum burst of requests available.
For example, a free user starts with a bucket size of 25 requests and a refill rate of 5 requests per minute (one request per 12 seconds). If the user were to send 26 requests all at once, the first 25 would be processed, but the last request would be rate limited. They would need to wait 12 seconds and retry their last request for it to succeed.
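The free-plan example above can be simulated with a small token bucket. This is a minimal sketch, not Cloudflare's implementation: the class name, the method name, and the choice to express the refill as seconds per token (12 for the free plan) are assumptions made for illustration.

```javascript
// Minimal token bucket sketch (illustrative, not Cloudflare's code).
class TokenBucket {
  constructor(bucketSize, secondsPerToken, nowSeconds = 0) {
    this.bucketSize = bucketSize;           // maximum burst
    this.secondsPerToken = secondsPerToken; // refill interval
    this.tokens = bucketSize;               // assume a full bucket at the start
    this.lastRequest = nowSeconds;
  }

  // Returns true if the purge request is processed, false if rate limited.
  tryPurge(nowSeconds) {
    // Refill based on time since the last request, capped at the bucket size.
    const earned = (nowSeconds - this.lastRequest) / this.secondsPerToken;
    this.tokens = Math.min(this.bucketSize, this.tokens + earned);
    this.lastRequest = nowSeconds;
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}

// Free plan: bucket of 25, 5 requests per minute = one token every 12 seconds.
const free = new TokenBucket(25, 12);
const burst = Array.from({ length: 26 }, () => free.tryPurge(0));
console.log(burst.filter(Boolean).length); // 25
console.log(burst[25]);                    // false: the 26th request is rate limited
console.log(free.tryPurge(12));            // true: one token refilled after 12 seconds
```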
The current limits are applied per account:
Plan       | Bucket size  | Request refill rate | Max keys per request | Total keys
Free       | 25 requests  | 5 per minute        | 100                  | 500 per minute
Pro        | 25 requests  | 5 per second        | 100                  | 500 per second
Business   | 50 requests  | 10 per second       | 100                  | 1,000 per second
Enterprise | 500 requests | 50 per second       | 100                  | 5,000 per second
More detail on all purge rate limits can be found in our documentation.

What’s next?

\n We’ve spent a lot of time optimizing our purge platform. But we’re not done yet. Looking forward, we will continue to enhance the performance of Cloudflare’s single-file purge. The current P50 performance is around 250 ms, and we suspect that we can optimize it further to bring it under 200 ms. We will also build out our ability to allow for greater purge throughput for all of our systems, and will continue to find ways to implement filtering techniques to ensure we can continue to scale effectively and allow customers to purge whatever and whenever they choose.
We invite you to try out our new purge system today and deliver an instant, seamless experience to your visitors.
Published 2025-04-01 by Alex Krivit, Connor Harwood, and Zaidoon Abd Al Hadi. Tags: Cache, Speed & Reliability, Performance, Cache Purge.

Cloudflare enables native monitoring and forensics with Log Explorer and custom dashboards
In 2024, we announced Log Explorer, giving customers the ability to store and query their HTTP and security event logs natively within the Cloudflare network. Today, we are excited to announce that Log Explorer now supports logs from our Zero Trust product suite. In addition, customers can create custom dashboards to monitor suspicious or unusual activity.
Every day, Cloudflare detects and protects customers against billions of threats, including DDoS attacks, bots, web application exploits, and more. SOC analysts, who are charged with keeping their companies safe from the growing spectre of Internet threats, may want to investigate these threats to gain additional insights on attacker behavior and protect against future attacks. Log Explorer, by collecting logs from various Cloudflare products, provides a single starting point for investigations. As a result, analysts can avoid forwarding logs to other tools, maximizing productivity and minimizing costs. Further, analysts can monitor signals specific to their organizations using custom dashboards.

Zero Trust dataset support in Log Explorer

Log Explorer stores your Cloudflare logs for a 30-day retention period so that you can analyze them natively and in a single interface, within the Cloudflare Dashboard. Cloudflare log data is diverse, reflecting the breadth of capabilities available. For example, HTTP requests contain information about the client such as their IP address, request method, autonomous system number (ASN), request paths, and TLS versions used. Additionally, Cloudflare’s Application Security WAF Detections enrich these HTTP request logs with additional context, such as the WAF attack score, to identify threats.
Today we are announcing that seven additional Cloudflare product datasets are now available in Log Explorer. These seven datasets are the logs generated from our Zero Trust product suite, and include logs from Access, Gateway DNS, Gateway HTTP, Gateway Network, CASB, Zero Trust Network Session, and Device Posture Results. Read on for examples of how to use these logs to identify common threats.

Investigating unauthorized access

\n By reviewing Access logs and HTTP request logs, we can reveal attempts to access resources or systems without proper permissions, including brute force password attacks, indicating potential security breaches or malicious activity.
Below, we filter Access Logs on the Allowed field to see activity related to unauthorized access.
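To make the filtering step concrete, here is a small sketch that applies the same idea to log records already exported as objects. Only the Allowed field comes from this post; the sample records, the IPAddress and Email field names, and the function name are assumptions for illustration, and in Log Explorer itself you would express this as a query rather than as code.

```javascript
// Hypothetical sample records; only the Allowed field is named in this post.
const accessLogs = [
  { Allowed: false, IPAddress: "203.0.113.7", Email: "a@example.com" },
  { Allowed: false, IPAddress: "203.0.113.7", Email: "b@example.com" },
  { Allowed: true, IPAddress: "192.0.2.10", Email: "c@example.com" },
  { Allowed: false, IPAddress: "198.51.100.2", Email: "a@example.com" },
  { Allowed: false, IPAddress: "203.0.113.7", Email: "c@example.com" },
];

// Count denied requests per source IP, most active sources first.
// Many denials from one address can hint at a brute force attempt.
function deniedAttemptsByIp(logs) {
  const counts = new Map();
  for (const entry of logs) {
    if (entry.Allowed) continue; // keep only denied requests
    counts.set(entry.IPAddress, (counts.get(entry.IPAddress) || 0) + 1);
  }
  return [...counts.entries()].sort((a, b) => b[1] - a[1]);
}

console.log(deniedAttemptsByIp(accessLogs));
```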
By then reviewing the HTTP logs for the requests identified in the previous query, we can assess if bot networks are the source of unauthorized activity.
With this information, you can craft targeted Custom Rules to block the offending traffic.

Detecting malware

\n Cloudflare's Web Gateway can track which websites users are accessing, allowing administrators to identify and block access to malicious or inappropriate sites. These logs can be used to detect if a user’s machine or account is compromised by malware attacks. When reviewing logs, this may become apparent when we look for records that show a rapid succession of attempts to browse known malicious sites, such as hostnames that have long strings of seemingly random characters that hide their true destination. In this example, we can query logs looking for requests to a spoofed YouTube URL.
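One way to surface hostnames with long strings of seemingly random characters is to score each label by its character entropy. The sketch below is a generic heuristic offered as an assumption, not a description of how Gateway or Log Explorer detects malware; the function names and both thresholds are invented for this example and would need tuning against real traffic.

```javascript
// Shannon entropy (bits per character) of a string.
function shannonEntropy(text) {
  const counts = {};
  for (const ch of text) counts[ch] = (counts[ch] || 0) + 1;
  let entropy = 0;
  for (const n of Object.values(counts)) {
    const p = n / text.length;
    entropy -= p * Math.log2(p);
  }
  return entropy;
}

// Flags a hostname if any label is both long and high-entropy.
// minLength and minEntropy are illustrative thresholds, not recommendations.
function looksRandom(hostname, minLength = 20, minEntropy = 3.5) {
  return hostname
    .split(".")
    .some((label) => label.length >= minLength && shannonEntropy(label) >= minEntropy);
}

console.log(looksRandom("www.youtube.com"));
console.log(looksRandom("x7f9q2kd8vz3m1p0w4ub6tya.youtube.example"));
```

A heuristic like this only narrows down candidates for review; long, legitimate hostnames (for example, some CDN or storage endpoints) can score high as well.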

Monitoring what matters using custom dashboards

Security monitoring is not one size fits all. For instance, companies in the retail or financial industries worry about fraud, while every company is concerned about exfiltration of information like trade secrets. And any form of personally identifiable information (PII) is a target for data breaches or ransomware attacks.
While log exploration helps you react to threats, our new custom dashboards allow you to define the specific metrics you need in order to monitor threats you are concerned about.
Getting started is easy, with the ability to create a chart using natural language. A natural language interface is integrated into the chart create/edit experience, enabling you to describe in your own words the chart you want to create. Similar to the AI Assistant we announced during Security Week 2024, the prompt translates your language to the appropriate chart configuration, which can then be added to a new or existing custom dashboard.
Use a prompt: Enter a query like “Compare status code ranges over time”. The AI model decides the most appropriate visualization and constructs your chart configuration.
Customize your chart: Select the chart elements manually, including the chart type, title, dataset to query, metrics, and filters. This option gives you full control over your chart’s structure.
Video: entering the natural language description “compare status code ranges over time” shows a preview chart (a time series grouped by error code ranges); selecting “add chart” saves it to the dashboard.
For more help getting started, we have some pre-built templates that you can use for specific monitoring use cases. Available templates currently include:
Bot monitoring: Identify automated traffic accessing your website
API Security: Monitor the data transfer and exceptions of API endpoints within your application
API Performance: See timing data for API endpoints in your application, along with error rates
Account Takeover: View login attempts and usage of leaked credentials, and identify account takeover attacks
Performance Monitoring: Identify slow hosts and paths on your origin server, and view time to first byte (TTFB) metrics over time
Templates provide a good starting point, and once you create your dashboard, you can add or remove individual charts using the same natural language chart creator.
Video: editing a chart from an existing dashboard and moving individual charts via drag and drop.

Example use cases

\n Custom dashboards can be used to monitor for suspicious activity, or to keep an eye on performance and errors for your domains. Let’s explore some examples of suspicious activity that we can monitor using custom dashboards.
Take, for example, our use case from above: investigating unauthorized access. With custom dashboards, you can create a dashboard using the Account Takeover template to monitor for suspicious login activity related to your domain.
As another example, spikes in requests or errors are common indicators that something is wrong, and they can sometimes be signals of suspicious activity. With the Performance Monitoring template, you can view origin response time and time to first byte metrics as well as monitor for common errors. For example, in this chart, the spikes in 404 errors could be an indication of an unauthorized scan of your endpoints.
When using custom dashboards, if you observe a traffic pattern or spike in errors that you would like to further investigate, you can click the button to “View in Security Analytics” in order to drill down further into the data and craft custom WAF rules to mitigate the threat.
These tools, seamlessly integrated into the Cloudflare platform, will enable users to discover, investigate, and mitigate threats all in one place, reducing time to resolution and overall cost of ownership by eliminating the need to forward logs to third-party security analysis tools. And because it is a native part of Cloudflare, you can immediately use the data from your investigation to craft targeted rules that will block these threats.

What’s next

\n Stay tuned as we continue to develop more capabilities in the areas of observability and forensics, with additional features including:
Custom alerts: create alerts based on specific metrics or anomalies
Scheduled query detections: craft log queries and run them on a schedule to detect malicious activity
More integration: further streamlining the journey between detection, investigation, and mitigation across the full Cloudflare platform

How to get it

\n Current Log Explorer beta users get immediate access to the new custom dashboards feature. Pricing will be made available to everyone during Q2 2025. Between now and then, these features continue to be available at no cost.
Let us know if you are interested in joining our Beta program by completing this form, and a member of our team will contact you.
Published 2025-03-18 by Jen Sells. Tags: Security Week, Analytics, Logs, Security, R2 Storage, SIEM, Product News, Connectivity Cloud.

Banish bots from your Waiting Room and improve wait times for real users
With Cloudflare Waiting Room, you can safeguard your site from traffic surges by placing visitors in a customizable, virtual queue. Previously, many site visitors waited in the queue alongside bots, only to find themselves competing for inventory once in the application. This competition is inherently unfair, as bots are much faster and more efficient than humans. As a result, humans inevitably lose out in these high-demand situations, unable to secure inventory before bots sweep it all up. This creates a frustrating experience for real customers, who feel powerless against the speed and automation of bots, leading to a diminished experience overall. Those days are over! Today, we are thrilled to announce the launch of two Waiting Room solutions that significantly improve the visitor experience.
Now, all Waiting Room customers can add an invisible Turnstile challenge to their queueing page, robustly challenging traffic and gathering analytics on bot activity within their queue. With Advanced Waiting Rooms, you can select between invisible, managed, or non-interactive widget modes. But we won’t just block these bots! Instead, traffic with definite bot signals that has failed the Turnstile challenge can be sent to an Infinite Queue, a completely customizable page that mimics a real user experience. This prolongs the time it takes bots to realize they have not actually joined the queue, wasting their resources without impacting real users. This feature not only protects your site against bots, but also reduces wait times and protects inventory by ensuring the queue only consists of genuine users. To offset the environmental impact of wasting bot resources, we’re contributing to a tree planting initiative, helping to reduce the carbon footprint of inefficient bots.
The second solution we have launched to improve the visitor experience is Session Revocation, which allows you to end a user’s session based on an action, dynamically opening up spots and admitting users from the queue. This new capability allows you to integrate Waiting Room more seamlessly with your customer journey, resulting in increased throughput, decreased wait times, and increased fairness by giving more users the opportunity to make it through the queue during high demand events.
This feature has proven to be extremely impactful for our customers, including a large online retailer that frequently has high-demand limited edition product drops. A common challenge in this space is maximizing the number of customers who can make a purchase during a limited-time event, all while maintaining a fair and efficient system for everyone involved. Previously, this customer had to limit their users to only one item in the cart and force them to wait for a period of time after each checkout before allowing them to rejoin the queue. This led to an awkward experience for end users, longer wait times, and reduced site throughput. With session revocation, this online retailer can now end the user’s session immediately after a purchase is complete, placing the user back in the queue if applicable, without being forced to wait for a preset timeout period. This significantly improves the end user experience by reducing unnecessary wait times and streamlining the purchase process.
Let’s dive deep into these two capabilities and how they improve the overall user experience.

How bots impact the Waiting Room user experience

Waiting Room is often used to protect sites from being overwhelmed by traffic surges during high-demand online events. These events, such as ticket or e-commerce product sales, attract both a deluge of genuine users and sophisticated bots, such as scalper bots. These bots are unusual in that they can complete the checkout process or user journey much more quickly than human visitors. Bots in the queue negatively affect the user experience by increasing wait times, as they often occupy multiple spots. Their behavior can also exacerbate the issue: if they don't handle cookies properly, they fail to take their spot in the application when their turn comes, further preventing the queue from progressing smoothly. Once past the queue, bots can also contribute to inventory hoarding, as they often reserve or consume large quantities of stock without genuine intent to purchase. An example of this is the PlayStation 5’s launch in November 2020. Due to high demand and production limitations during the COVID-19 pandemic, scalper bots bought up stock quickly, making it difficult for average consumers to purchase the console at retail prices. This led to extreme frustration for retailers and consumers as these bots drove prices up significantly.

Quantifying bot traffic to Waiting Room with an invisible Turnstile challenge

\n Waiting Room customers have long been curious about the nature of large traffic spikes. Historically, bot scores and managed challenges have been the primary methods of collecting this data and acting on it. While these can provide some insight into the distribution of traffic, the Turnstile invisible challenge gives us the ability to actively interrogate the browser, providing the most complete set of data on whether that browser is being operated by a human or a bot.
To start quantifying bot traffic to waiting rooms, we have added an invisible Turnstile challenge to all basic rooms. With the purchase of an Advanced Waiting Room, customers can select between invisible, managed, or non-interactive widget modes. This Turnstile team blog post has more details on the different widget modes.
Waiting Room’s integration with Turnstile aims to protect your site with minimal impact to the user experience by placing a Turnstile challenge on your waiting room’s queuing page. Unlike a standard WAF challenge, the Waiting Room Turnstile challenge is presented only when the waiting room is queuing. This way, users won’t face any interruptions once they are past the queue and into the application.
With an advanced waiting room, you can configure the type of Turnstile challenge from the Cloudflare dashboard and API.
From the analytics we’ve gathered with the invisible Turnstile challenge on all basic waiting rooms, we’ve been able to determine that many large traffic spikes come from user agents that don’t even attempt to run the challenge, leaving it unsolved. In other words, we send the challenge widget in the HTML for the queuing page, but sometimes those challenges never get completed. By subtracting the number of times we see solved challenges from the total number of times we send challenges, we can get a count of requests that are likely from unsophisticated bots. These requests are reported to Waiting Room Analytics as “Likely Bots.” We’ve seen small businesses with low baseline traffic hit with tens of thousands of such requests (or more) in a short period of time. When a large influx of non-human traffic like this comes in, every visitor to the website ends up queued in a waiting room, not just the bots.
These bots could be any software that simply sends out HTTP requests. This data can help determine whether a traffic spike and subsequent queueing is coming from real human users, or a bunch of simple bots that don’t even bother to run JavaScript.
With the Turnstile integration, we are also catching sophisticated bots. While many of the bots we see don’t attempt to run the challenge, there are a few that do. Detecting these bots is more difficult than detecting simple bots that don’t run JavaScript. The Turnstile widget runs a series of checks against the browser to find evidence that a browser isn’t being operated by a human, and is instead being driven by something like Selenium. If Turnstile isn’t able to determine that the browser is being operated by a human, we count that as a failed challenge and report those users to Waiting Room Analytics as “Bots,” since we are quite confident that these users are not human.
About 1 in 20 “users” that run the challenge end up not passing. Just like the previously mentioned unsophisticated bots, these more sophisticated bots inflate the size of the queue, making it more difficult for real humans to make it through to your website.
The remaining 19 in 20 “users” that successfully pass the challenge are counted in Waiting Room Analytics as “Likely Humans.”
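The three categories described above reduce to simple arithmetic over challenge counts. The sketch below assumes that "solved" means the browser ran the challenge to completion (whether it passed or failed); the function name, field names, and sample numbers are hypothetical and are not the names used by the Waiting Room Analytics API.

```javascript
// Derive the three traffic categories from raw challenge counts.
// Assumption: a challenge that ran to completion either passed or failed,
// and anything sent but never completed came from a client that did not run it.
function summarizeChallenges({ sent, passed, failed }) {
  const completed = passed + failed;
  return {
    likelyBots: sent - completed, // challenge sent but never run
    bots: failed,                 // challenge run and failed
    likelyHumans: passed,         // challenge run and passed
  };
}

// Hypothetical spike: 1,000 challenges sent, 400 completed, 1 in 20 of those failed.
console.log(summarizeChallenges({ sent: 1000, passed: 380, failed: 20 }));
// { likelyBots: 600, bots: 20, likelyHumans: 380 }
```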
These new metrics related to Turnstile challenge outcomes are available in your Waiting Room Analytics dashboard and the analytics GraphQL API, so you can see the distribution of bot to human traffic in your waiting room. Once you know what your traffic looks like, the real question is: what can you do about it?
View the distribution of traffic and challenges issued in Waiting Room Analytics

New Infinite Queue feature

Beyond logging your Turnstile challenge outcomes, Advanced Waiting Room customers have the option to select the Infinite Queue feature. With this feature, all traffic that fails the Turnstile challenge, such as bot traffic, is sent to an Infinite Queue page. The Infinite Queue matches the normal queuing experience, prolonging the time it takes a bot to recognize it is being blocked and effectively consuming its resources. While the Infinite Queue has the same look and feel as the Waiting Room page, the bot is not actually a part of the real queue.
With the Infinite Queue enabled, all traffic has to pass the challenge to enter the real queue. By blocking bots from joining the queue, we reduce wait times for humans and prevent bots from using up server resources during a traffic spike.
Enable the Infinite Queue option through the Cloudflare dashboard or API.
Bots will be none the wiser, wasting their time and resources waiting in an infinite queue that will never get them to where they’re trying to go.
We keep track of the traffic hitting the infinite queue, counting the number of times these clients refresh their queuing page in Waiting Room Analytics. This appears as the “infinite queue refreshes” count in the analytics dashboard and GraphQL API. This metric gives you a good idea of the amount of time these bots have wasted trying to reach your website.

How Waiting Room integrates with Turnstile

Turnstile is a powerful and versatile product that anyone, Cloudflare and others alike, can use to build systems to thwart bot traffic. Waiting Room integrates Turnstile the same way any other Turnstile user would.
<!DOCTYPE html>
<html>
  <head>
    <title>Waiting Room</title>
  </head>
  <body>
    <h1>You are currently in the queue.</h1>
    {{#waitTimeKnown}}
      <h2>Your estimated wait time is {{waitTimeFormatted}}.</h2>
    {{/waitTimeKnown}}
    {{^waitTimeKnown}}
      <h2>Your estimated wait time is unknown.</h2>
    {{/waitTimeKnown}}
    {{#turnstile}}
      <!-- for a managed (and potentially interactive) challenge, you may want to instruct the user to complete the challenge -->
      <p>Please complete this challenge so we know you're a human:</p>
      {{{turnstile}}} <!-- include the turnstile widget -->
    {{/turnstile}}
  </body>
</html>
The Turnstile widget can be embedded in custom queuing page templates by including the {{{turnstile}}} variable.
<!DOCTYPE html>
<html>
  <head>
    <title>Waiting Room</title>
  </head>
  <body>
    {{#turnstile}}
      <h1>This website is currently using a waiting room.</h1>
      <p>We use a Turnstile challenge to ensure you aren't waiting in line behind bots. Complete this challenge to enter the queue.</p>
      {{{turnstile}}} <!-- include the turnstile widget -->
    {{/turnstile}}
    {{^turnstile}}
      <h1>You are currently in the queue.</h1>
      {{#waitTimeKnown}}
        <h2>Your estimated wait time is {{waitTimeFormatted}}.</h2>
      {{/waitTimeKnown}}
      {{^waitTimeKnown}}
        <h2>Your estimated wait time is unknown.</h2>
      {{/waitTimeKnown}}
    {{/turnstile}}
  </body>
</html>
When using Infinite Queue (especially with managed challenges which may be interactive), you may want to tell users they will not be in the queue until they complete the challenge.
We embed a plain Turnstile challenge in the queuing page by passing the HTML to the queuing page template in a turnstile variable. The default queuing page template and any newly created custom templates include this variable already. If you have an existing custom HTML template and wish to enable the Turnstile integration, you will need to add {{{turnstile}}} somewhere in the template to tell Waiting Room where the widget should be placed. Waiting Room uses Mustache templates, so including raw HTML within your template without escaping requires three curly braces instead of two.
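The difference between two and three curly braces can be shown with a toy renderer. This is a hand-rolled stand-in written for this example, not the Mustache library and not Waiting Room's template engine: it handles only simple variables, the widget markup is a placeholder, and real Mustache implementations may escape additional characters.

```javascript
// Escape the characters that would otherwise be interpreted as HTML.
const HTML_ESCAPES = { "&": "&amp;", "<": "&lt;", ">": "&gt;", '"': "&quot;", "'": "&#39;" };
const escapeHtml = (value) => String(value).replace(/[&<>"']/g, (ch) => HTML_ESCAPES[ch]);

// Toy renderer: {{{name}}} inserts the value as-is, {{name}} inserts it escaped.
function renderVars(template, view) {
  return template.replace(
    /\{\{\{\s*(\w+)\s*\}\}\}|\{\{\s*(\w+)\s*\}\}/g,
    (match, rawKey, escapedKey) =>
      rawKey !== undefined ? String(view[rawKey] ?? "") : escapeHtml(view[escapedKey] ?? "")
  );
}

// Placeholder markup standing in for the widget HTML passed to the template.
const view = { turnstile: '<div id="widget"></div>' };

console.log(renderVars("{{{turnstile}}}", view)); // <div id="widget"></div>
console.log(renderVars("{{turnstile}}", view));   // &lt;div id=&quot;widget&quot;&gt;&lt;/div&gt;
```

With two braces the browser would display the widget markup as text instead of rendering it, which is why the template needs the triple-brace form.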
A managed Turnstile challenge on the default Waiting Room queuing page template
Once the challenge completes, fails, or times out, the page refreshes and passes the Turnstile token to Waiting Room’s worker. Next, we check in with Turnstile’s siteverify endpoint to make sure the challenge was successful. From there, we report the outcome to the Waiting Room’s analytics and optionally send failed traffic (bots) to an infinite queue.
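The server-side validation step might look roughly like the sketch below. The siteverify URL and the secret, response, and success fields reflect Turnstile's public server-side validation API as we understand it; the function signature, the injectable fetchImpl parameter, and the choice to treat any HTTP error as a failed challenge are assumptions for this example rather than Waiting Room's actual code.

```javascript
// Validate a Turnstile token against the siteverify endpoint.
// Returns true only when the endpoint reports success.
async function siteverify(token, secretKey, fetchImpl = fetch) {
  const res = await fetchImpl(
    "https://challenges.cloudflare.com/turnstile/v0/siteverify",
    {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ secret: secretKey, response: token }),
    }
  );
  if (!res.ok) return false; // assumption: treat HTTP errors as a failed challenge
  const outcome = await res.json();
  return outcome.success === true;
}
```

Passing fetchImpl as a parameter keeps the function testable without network access; in a worker you would normally rely on the default global fetch.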
The infinite queue itself is designed to be as close to normal queuing as possible. When a bot is sent to the infinite queue, we issue it a cookie which looks like a normal waiting room cookie. Inside the cookie’s encryption though, we have a boolean flag that tells our worker to send the bot’s requests to the infinite queue. When we see that flag, we skip all the normal queuing logic and just render a queuing page.
That queuing page shows a fake estimated time remaining. It’s based on an asymptotic curve which appears to decrease linearly from the start. As time goes on, the curve gets flatter (and progress through the “queue” gets slower), so the estimated time remaining never quite reaches 0.
This graph is an approximation of the time remaining (y-axis, minutes) that bots will see, compared to the amount of time they’ve waited in the infinite queue (x-axis, minutes).
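The post does not give the formula behind this curve. As one possible shape with the described properties, the sketch below uses exponential decay: it starts out dropping by about one minute per minute waited, flattens over time, and never reaches zero. The function name, its parameters, and the choice of an exponential are all assumptions for illustration.

```javascript
// Fake "time remaining" in minutes after waiting `waitedMinutes`,
// starting from an initial estimate of `initialMinutes`.
// Exponential decay: the initial slope is -1 (looks linear at first),
// and the value approaches 0 without ever reaching it.
function fakeTimeRemaining(waitedMinutes, initialMinutes = 60) {
  return initialMinutes * Math.exp(-waitedMinutes / initialMinutes);
}

for (const waited of [0, 1, 10, 60, 240]) {
  console.log(waited, fakeTimeRemaining(waited).toFixed(2));
}
```

A real page would presumably round the value and keep it above some floor, so that floating-point underflow after a very long wait never displays as zero.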
We reuse much of the same code for rendering the queuing page for the infinite queue and the normal queue. We do this to reduce the amount of signal bots may have that they are in the infinite queue rather than the normal queue.
    let cookie
    if (query['cf_wr_turnstile']) {
      const turnstileToken = query['cf_wr_turnstile']
      const tokenOk = await siteverify(turnstileToken)
      if (tokenOk) {
        analytics.turnstileSuccesses++
        cookie = newCookie()
      } else {
        analytics.turnstileFailures++
        cookie = { infiniteQueuing: true }
      }
      response.headers['Set-Cookie'] = encryptCookie(cookie)
    }
    if (!cookie) {
      cookie = decryptCookie(headers['Cookie'])
    }
    if (!cookie) {
      analytics.turnstileChallenges++
      return await queuingPage(await estimateTimeRemaining(), { turnstileChallenge: true })
    } else if (cookie.infiniteQueuing) {
      analytics.infiniteQueueRequests++
      return await queuingPage(fakeTimeRemaining())
    } else if (cookie.accepted) {
      return await sendToOrigin()
    } else {
      // run Waiting Room's distributed queuing logic to check whether
      // this user has made it to the front of the queue, but only after
      // the user has completed a Turnstile challenge and isn't in the
      // fake infinite queue
      const { letThrough, timeRemaining } = calculateQueuing(cookie)
      if (letThrough) {
        cookie.accepted = true
        response.headers['Set-Cookie'] = encryptCookie(cookie)
        return await sendToOrigin()
      } else {
        return await queuingPage(timeRemaining)
      }
    }
Approximate pseudocode for how we handle incoming requests when infinite queue is enabled in the Waiting Room worker
Thanks to the versatility of Turnstile, we only needed to rely on public Turnstile APIs to build this integration.
Adding Turnstile to Waiting Room is a proactive step in managing traffic that directly contributes to a smoother, faster experience for end users. Building on that efficiency, let’s dive into how you can add an additional layer of control to increase throughput and minimize wait times for your customers.
Further improve wait times using session revocation
In a previous blog post, we described in detail how we queue users based on the current active users on the application and the defined limits, as well as the state and calculations we use to determine the total number of active users. Here is a quick summary for those who have not read that post:
When a user navigates to a page behind a waiting room, they receive a cookie and are associated with a time period called a bucket. We use these buckets to track the number of users either waiting in the queue or accessing the application for that specific time period. Whenever a user makes a request, we move their session from their previous bucket to the latest bucket. Once a bucket is older than the configured session duration, we know that those user sessions are no longer valid (expired) and we can clean up those values. Thus, that user session expires, and new slots are opened for the next users to enter the application.
These buckets are aggregated at Cloudflare data centers and then globally via the internal state of the waiting room, which is structured as multiple CRDT counters and registers. This allows us to merge the distributed state of the waiting room stored in multiple data centers as a single global state without conflicts.
To calculate the total active users on an application, we first merge the state from all data centers. Then, we sum the active users for all the buckets where a session can still be active.
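The bucket bookkeeping summarized above can be sketched roughly as follows. This is an illustration only: the one-minute bucket size, the per-data-center `version` field, and the "keep the newer entry" merge rule are assumptions standing in for the CRDT counters and registers, whose real structure is not described in this post.

```javascript
// Hypothetical sketch; names and data shapes are illustrative, not Waiting Room internals.
const BUCKET_MS = 60_000 // assumption: one bucket per minute

function bucketFor(timestampMs) {
  return Math.floor(timestampMs / BUCKET_MS)
}

// state: { [dataCenter]: { version: number, buckets: { [bucketId]: activeUsers } } }
// Assumption: each data center only writes its own entry, so keeping the entry
// with the higher version per data center merges two states without conflicts.
function mergeStates(a, b) {
  const merged = { ...a }
  for (const [dataCenter, entry] of Object.entries(b)) {
    if (!merged[dataCenter] || entry.version > merged[dataCenter].version) {
      merged[dataCenter] = entry
    }
  }
  return merged
}

// Sum active users over every bucket in which a session could still be active.
function totalActiveUsers(state, nowMs, sessionDurationMs) {
  const oldestLiveBucket = bucketFor(nowMs - sessionDurationMs)
  let total = 0
  for (const { buckets } of Object.values(state)) {
    for (const [bucketId, count] of Object.entries(buckets)) {
      if (Number(bucketId) >= oldestLiveBucket) total += count
    }
  }
  return total
}
```

Buckets older than the session duration simply drop out of the sum, which is how expired sessions free up slots in this sketch.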
Because Waiting Room runs per user request, we do not explicitly know when a user has stopped accessing the application; we only stop receiving requests from them. So, we must consider their session active, and count it toward the total active users, until it is older than the session duration limit. For waiting rooms with a high session duration configured, a user might visit the site only briefly but still count toward the total active users for up to the configured session duration after they have stopped accessing the application. This can cause decreased throughput and longer wait times for users in the queue.
Introducing Session Revocation
With the Session Revocation feature, we now allow origins to return a command to the waiting room via an HTTP header (Cf-Waiting-Room-Command) to notify the Waiting Room to revoke the user session associated with the current response. This command removes the current user’s session and decreases the number of total active users for the bucket the session was last tracked in. This allows origins to terminate a user’s session early without needing to wait for the session to expire naturally.
This can improve the throughput of waiting rooms in front of applications which have a dynamic user flow where the session duration is set very high to account for users who send infrequent requests to the application.
To set up session revocation in your waiting room, in the user session settings section in the configuration, check the “Allow session termination via origin commands” box. You must also configure your origin to return a session revocation HTTP header (Cf-Waiting-Room-Command: revoke) on the response when you want the session associated with that response to be revoked. For more information on how to do this, refer to our developer documentation.
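On the origin side, this amounts to adding one header to the response that ends the user's flow. The helper below is a minimal, framework-agnostic sketch; the function name and the order-confirmation example are hypothetical, and only the header name and value come from the text above.

```javascript
// Hypothetical helper: returns a copy of the response headers with the
// session revocation command added, leaving the original object untouched.
function withSessionRevocation(responseHeaders = {}) {
  return { ...responseHeaders, 'Cf-Waiting-Room-Command': 'revoke' }
}

// Example: headers for an assumed "order complete" response.
const orderCompleteHeaders = withSessionRevocation({ 'Content-Type': 'text/html' })
```

Send the command only on the response where the user is genuinely finished, since the session tied to that response is revoked.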
Enable session revocation in the user session settings configuration
In Waiting Room Analytics, you can view the number of sessions revoked per minute. In the analytics GraphQL API, the sessionsRevoked field is the number of sessions revoked in that minute.
In summary, Waiting Room Turnstile Integration and Session Revocation work together to enhance both security and user experience. The addition of a Turnstile challenge in the Waiting Room helps identify and block bots, ensuring that legitimate users don’t face unnecessary delays. Meanwhile, the Session Revocation feature optimizes resource usage by allowing you to end user sessions after key actions, like completing a purchase, freeing up space for other users.
Together, these features successfully increase throughput and reduce wait times, providing a faster, more efficient experience for your customers. For more information on these features, check out our developer documentation.
Published: 2025-03-03
Authors: Rachel Wyatt, Piper McCorkle, Brad Swenson
Tags: Waiting Room, Product News, Application Services

Crawler Hints Update: Cloudflare Supports IndexNow and Announces General Availability
2021-10-18


In the midst of the hottest summer on record, Cloudflare held its first ever Impact Week. We announced a variety of products and initiatives that aim to make the Internet and our planet a better place, with a focus on environmental, social, and governance projects. Today, we’re excited to share an update on Crawler Hints, an initiative announced during Impact Week. Crawler Hints is a service that improves the operating efficiency of approximately 45% of the Internet traffic that comes from web crawlers and bots.
Crawler Hints achieves this efficiency improvement by ensuring that crawlers get information about what they’ve crawled previously and whether it makes sense to crawl a website again.
Today we are excited to announce two updates for Crawler Hints:
The first: Crawler Hints now supports IndexNow, a new protocol that allows websites to notify search engines whenever content on their website is created, updated, or deleted. By collaborating with Microsoft and Yandex, Cloudflare can help improve the efficiency of their search engine infrastructure, customer origin servers, and the Internet at large.
The second: Crawler Hints is now generally available to all Cloudflare customers for free. Customers can benefit from these more efficient crawls with a single button click. If you want to enable Crawler Hints, you can do so in the Cache Tab of the Dashboard.
What problem does Crawler Hints solve?
Crawlers help make the Internet work. Crawlers are automated services that travel the Internet looking for… well, whatever they are programmed to look for. To power experiences that rely on indexing content from across the web, search engines and similar services operate massive networks of bots that crawl the Internet to identify the content most relevant to a user query. But because content on the web is always changing, and there is no central clearinghouse for when these changes happen on websites, search engine crawlers have a Sisyphean task. They must continuously wander the Internet, making guesses on how frequently they should check a given site for updates to its content.
Companies that run search engines have worked hard to make the process as efficient as possible, pushing the state-of-the-art for crawl cadence and infrastructure efficiency. But there remains one clear area of waste: excessive crawl.
At Cloudflare, we see traffic from all the major search crawlers, and have spent the last year studying how often these bots revisit a page that hasn't changed since they last saw it. Every one of these visits is a waste. And, unfortunately, our observation suggests that 53% of this crawler traffic is wasted.
With Crawler Hints, we expect to make this task a bit more tractable by providing an additional heuristic to the people who run these crawlers. This will allow them to know when content has been changed or added to a site instead of relying on preferences or previous changes that might not reflect the true change cadence for a site. Crawler Hints aims to increase the proportion of relevant crawls and limit crawls that don’t find fresh content, improving customer experience and reducing the need for repeated crawls.
Cloudflare sits in a unique position on the Internet to help give crawlers hints about when they should recrawl a site. Don’t knock on a website’s door every 30 seconds to see if anything is new when Cloudflare can proactively tell your crawler when it’s the best time to index new or changed content. That’s Crawler Hints in a nutshell!
If you want to learn more about Crawler Hints, see the original blog.
IndexNow is a standard written by the Microsoft and Yandex search engines. The standard aims to provide an efficient way of signaling to search engines and other crawlers when they should crawl content. Cloudflare’s Crawler Hints now supports IndexNow.
“In its simplest form, IndexNow is a simple ping so that search engines know that a URL and its content has been added, updated, or deleted, allowing search engines to quickly reflect this change in their search results.” - www.indexnow.org
By enabling Crawler Hints on your website, with the simple click of a button, Cloudflare will take care of signaling to these search engines when your content has changed via the IndexNow protocol. You don’t need to do anything else!
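To give a sense of what such a ping looks like, the sketch below builds an IndexNow submission URL for a single changed page, following the `url` and `key` query parameters described at www.indexnow.org. The function name, the example key, and the default host are assumptions for illustration; how Cloudflare submits hints on your behalf is not described here.

```javascript
// Hypothetical helper: builds the GET URL for a single-URL IndexNow ping.
// `key` is the site's IndexNow key; the default host is one participating engine.
function indexNowPingUrl(changedUrl, key, searchEngineHost = 'www.bing.com') {
  const ping = new URL(`https://${searchEngineHost}/indexnow`)
  ping.searchParams.set('url', changedUrl) // percent-encoded automatically
  ping.searchParams.set('key', key)
  return ping.toString()
}

// Example with made-up values.
const examplePing = indexNowPingUrl('https://example.com/new-post', 'abc123')
```

With Crawler Hints enabled, you do not need to construct or send these pings yourself.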
What does this mean for search engine operators? With Crawler Hints you’ll receive a near real-time, pushed feed of change events of Cloudflare websites (that have opted in). This, in turn, will dramatically improve not just the quality of your results, but also the energy efficiency of running your bots.
Collaborating with Industry leaders
Cloudflare is in a unique position: a sizable portion of the Internet is proxied behind us. As a result, we are able to see trends in the way bots access web resources. That visibility allows us to be proactive about signaling which crawls are required and which are not. We are excited to work with partners to make these insights useful to our customers. Search engines are key constituents in this equation. We are happy to collaborate and share this vision of a more efficient Internet with Microsoft Bing and Yandex. We have been testing our interaction via IndexNow with Bing and Yandex for months with some early successes.
This is just the beginning. Crawler Hints is a continuous process that will require working with more and more partners to improve Internet efficiency more generally. While this may take time and participation from other key parts of the industry, we are open to collaborating with any interested participant who relies on crawling to power user experiences.
“The cache data from CDNs is a really valuable signal for content freshness. Cloudflare, as one of the top CDNs, is key in the adoption of IndexNow to become an industry-wide standard with a large portion of the internet actually using it. Cloudflare has built a really easy 1-click button for their users to start using it right away. Cloudflare’s mission of helping build a better Internet resonates well with why I started IndexNow i.e. to build a more efficient and effective Search.” - Fabrice Canel, Principal Program Manager
“Yandex is excited to join IndexNow as part of our long-term focus on sustainability. We have been working with the Cloudflare team in early testing to incorporate their caching signals in our crawling mechanism via the IndexNow API. The results are great so far.” - Maxim Zagrebin, Head of Yandex Search
“DuckDuckGo is supportive of anything that makes search more environmentally friendly and better for end users without harming privacy. We're looking forward to working with Cloudflare on this proposal.” - Gabriel Weinberg, CEO and Founder
How do Cloudflare customers benefit?
Crawler Hints doesn’t just benefit search engines. For our customers and origin owners, Crawler Hints will help search engines and other bot-powered experiences keep the freshest version of your content, translating into happier users and ultimately influencing search rankings. Crawler Hints will also mean less traffic hitting your origin, reducing resource consumption. Moreover, your site performance will improve as well: your human customers will not be competing with bots!
And for Internet users? When you interact with bot-fed experiences — which we all do every day, whether we realize it or not, like search engines or pricing tools — these will now deliver more useful results from crawled data, because Cloudflare has signaled to the owners of the bots the moment they need to update their results.
How can I enable Crawler Hints for my website?
Crawler Hints is free to use for all Cloudflare customers and promises to revolutionize web efficiency. If you’d like to see how Crawler Hints can benefit how your website is indexed by the world’s biggest search engines, please feel free to opt in to the service:
1. Sign in to your Cloudflare Account.
2. In the dashboard, navigate to the Cache tab.
3. Click on the Configuration section.
4. Locate the Crawler Hints sign up card and enable it. It's that easy.
Once you’ve enabled it, we will begin sending hints to search engines about when they should crawl particular parts of your website. Crawler Hints holds tremendous promise to improve the efficiency of the Internet.
We’re thrilled to collaborate with industry leaders Microsoft Bing and Yandex to bring IndexNow to Crawler Hints, and to bring Crawler Hints to a wide audience in general availability. We look forward to working with additional companies who run crawlers to help make this process more efficient for the whole Internet.
Cloudflare's connectivity cloud protects entire corporate networks, helps customers build Internet-scale applications efficiently, accelerates any website or Internet application, wards off DDoS attacks, keeps hackers at bay, and can help you on your journey to Zero Trust.
Visit 1.1.1.1 from any device to get started with our free app that makes your Internet faster and safer.
To learn more about our mission to help build a better Internet, start here. If you're looking for a new career direction, check out our open positions.
Tags: Crawler Hints, Product News, Speed & Reliability, SEO