We are excited to announce that the HTMLRewriter API for Cloudflare Workers is now GA! You can get started today by checking out our documentation, or trying out our tutorial for localizing your site with the HTMLRewriter.
Want to know how it works under the hood? We are excited to tell you everything you wanted to know but were afraid to ask, about building a streaming HTML parser on the edge; read about it in part 1 (and stay tuned for part two coming tomorrow!).
Faster, more scalable applications at the edge
The HTMLRewriter can help solve two big problems web developers face today: making changes to the HTML, when they are hard to make at the server level, and making it possible for HTML to live on the edge, closer to the user — without sacrificing dynamic functionality.
Since the introduction of Workers, Workers have helped customers regain control where control either wasn’t provided, or very hard to obtain at the origin level. Just like Workers can help you set CORS headers at the middleware layer, between your users and the origin, the HTMLRewriter can assist with things like URL rewrites (see the example below!).
Back in January, we introduced the Cache API, giving you more control than ever over what you could store in our edge caches, and how. By helping users have closer control of the cache, users could push experiences that were previously only able to live at the origin server to the network edge. A great example of this is the ability to cache POST requests, where previously only GET requests were able to benefit from living on the edge.
Later this year, during Birthday Week, we announced Workers Sites, taking the idea of content on the edge one step further, and allowing you to fully deploy your static sites to the edge.
We were not the first to realize that the web could benefit from more content being accessible from the CDN level. The motivation behind the resurgence of static sites is to have a canonical version of HTML that can easily be cached by a CDN.
This has been great progress for static content on the web, but what about so much of the content that’s almost static, but not quite? Imagine an e-commerce website: the items being offered, and the promotions are all the same — except that pesky shopping cart on the top right. That small difference means the HTML can no longer be cached, and sent to the edge. That tiny little number of items in the cart requires you to travel all the way to your origin, and make a call to the database, for every user that visits your site. This time of year, around Black Friday, and Cyber Monday, this means you have to make sure that your origin, and database are able to scale to the maximum load of all the excited shoppers eager to buy presents for their loved ones.
Edge-side JAMstack with HTMLRewriter
In September, we introduced the HTMLRewriter API in beta to the Cloudflare Workers runtime — a streaming parser API to allow developers to develop fully dynamic applications on the edge. HTMLRewriter is a jQuery-like experience inside your Workers application, allowing you to manipulate the DOM using selectors and handlers. The same way jQuery revolutionized front-end development, we’re excited for the HTMLRewriter to change how we think about architecting dynamic applications.
There are a few advantages to rewriting HTML on the edge, rather than on the client. First, updating content client-side introduces a tradeoff between the performance and consistency of the application — that is, if you want to, update a page to a logged in version client side, the user will either be presented with the static version first, and witness the page update (resulting in an inconsistency known as a flicker), or the rendering will be blocked until the customization can be fully rendered. Having the DOM modifications made edge-side provides the benefits of a scalable application, without the degradation of user experience. The second problem is the inevitable client-side bloat. Most of the world today is connected to the internet on a mobile phone with significantly less powerful CPU, and on shaky last-mile connections where each additional API call risks never making it back to the client. With the HTMLRewriter within Cloudflare Workers, the API calls for dynamic content, (or even calls directly to Workers KV for user-specific data) can be made from the Worker instead, on a much more powerful machine, over much more resilient backbone connections.
Here’s what the developers at Happy Cog have to say about HTMLRewriter:
— Matt Weinberg, – President of Development and Technology, Happy Cog
HTMLRewriter in Action
Since most web developers are familiar with jQuery, we chose to keep the HTMLRewriter very similar, using selectors and handlers to modify content.
Below, we have a simple example of how to use the HTMLRewriter to rewrite the URLs in your application from
The HTMLRewriter, which is instantiated once per Worker, is used here to select the
An element handler responds to any incoming element, when attached using the
.on function of the
Here, our AttributeRewriter will receive every element parsed by the HTMLRewriter instance. We will then use it to rewrite each attribute,
href for the
a element for example, to update the URL in the HTML.
How does the HTMLRewriter work?
How does the HTMLRewriter work under the hood? We’re excited to tell you about it, and have spared no details. Check out the first part of our two-blog post series to learn all about the not-so-brief history of rewriter HTML at Cloudflare, and stay tuned for part two tomorrow!
This content was originally published here.