Simon Willison’s Weblog

Subscribe
Atom feed for websockets

19 items tagged “websockets”

2024

Zero-latency SQLite storage in every Durable Object (via) Kenton Varda introduces the next iteration of Cloudflare's Durable Object platform, which recently upgraded from a key/value store to a full relational system based on SQLite.

For useful background on the first version of Durable Objects take a look at Cloudflare's durable multiplayer moat by Paul Butler, who digs into its popularity for building WebSocket-based realtime collaborative applications.

The new SQLite-backed Durable Objects is a fascinating piece of distributed system design, which advocates for a really interesting way to architect a large scale application.

The key idea behind Durable Objects is to colocate application logic with the data it operates on. A Durable Object comprises code that executes on the same physical host as the SQLite database that it uses, resulting in blazingly fast read and write performance.

How could this work at scale?

A single object is inherently limited in throughput since it runs on a single thread of a single machine. To handle more traffic, you create more objects. This is easiest when different objects can handle different logical units of state (like different documents, different users, or different "shards" of a database), where each unit of state has low enough traffic to be handled by a single object

Kenton presents the example of a flight booking system, where each flight can map to a dedicated Durable Object with its own SQLite database - thousands of fresh databases per airline per day.

Each DO has a unique name, and Cloudflare's network then handles routing requests to that object wherever it might live on their global network.

The technical details are fascinating. Inspired by Litestream, each DO constantly streams a sequence of WAL entries to object storage - batched every 16MB or every ten seconds. This also enables point-in-time recovery for up to 30 days through replaying those logged transactions.

To ensure durability within that ten second window, writes are also forwarded to five replicas in separate nearby data centers as soon as they commit, and the write is only acknowledged once three of them have confirmed it.

The JavaScript API design is interesting too: it's blocking rather than async, because the whole point of the design is to provide fast single threaded persistence operations:

let docs = sql.exec(`
  SELECT title, authorId FROM documents
  ORDER BY lastModified DESC
  LIMIT 100
`).toArray();

for (let doc of docs) {
  doc.authorName = sql.exec(
    "SELECT name FROM users WHERE id = ?",
    doc.authorId).one().name;
}

This one of their examples deliberately exhibits the N+1 query pattern, because that's something SQLite is uniquely well suited to handling.

The system underlying Durable Objects is called Storage Relay Service, and it's been powering Cloudflare's existing-but-different D1 SQLite system for over a year.

I was curious as to where the objects are created. According to this (via Hacker News):

Durable Objects do not currently change locations after they are created. By default, a Durable Object is instantiated in a data center close to where the initial get() request is made. [...] To manually create Durable Objects in another location, provide an optional locationHint parameter to get().

And in a footnote:

Dynamic relocation of existing Durable Objects is planned for the future.

where.durableobjects.live is a neat site that tracks where in the Cloudflare network DOs are created - I just visited it and it said:

This page tracks where new Durable Objects are created; for example, when you loaded this page from Half Moon Bay, a worker in San Jose, California, United States (SJC) created a durable object in San Jose, California, United States (SJC).

Where Durable Objects Live.    Created by the wonderful Jed Schmidt, and now maintained with ❤️ by Alastair. Source code available on Github.    Cloudflare Durable Objects are a novel approach to stateful compute based on Cloudflare Workers. They aim to locate both compute and state closest to end users.    This page tracks where new Durable Objects are created; for example, when you loaded this page from Half Moon Bay, a worker in San Jose, California, United States (SJC) created a durable object in Los Angeles, California, United States (LAX).    Currently, Durable Objects are available in 11.35% of Cloudflare PoPs.    To keep data fresh, this application is constantly creating/destroying new Durable Objects around the world. In the last hour, 394,046 Durable Objects have been created(and subsequently destroyed), FOR SCIENCE!    And a map of the world showing lots of dots.

# 13th October 2024, 10:26 pm / software-architecture, sqlite, cloudflare, litestream, scaling, websockets

openai/openai-realtime-console. I got this OpenAI demo repository working today - it's an extremely easy way to get started playing around with the new Realtime voice API they announced at DevDay last week:

cd /tmp
git clone https://github.com/openai/openai-realtime-console
cd openai-realtime-console
npm i
npm start

That starts a localhost:3000 server running the demo React application. It asks for an API key, you paste one in and you can start talking to the web page.

The demo handles voice input, voice output and basic tool support - it has a tool that can show you the weather anywhere in the world, including panning a map to that location. I tried adding a show_map() tool so I could pan to a location just by saying "Show me a map of the capital of Morocco" - all it took was editing the src/pages/ConsolePage.tsx file and hitting save, then refreshing the page in my browser to pick up the new function.

Be warned, it can be quite expensive to play around with. I was testing the application intermittently for only about 15 minutes and racked up $3.87 in API charges.

# 9th October 2024, 12:38 am / nodejs, javascript, openai, websockets, generative-ai, ai, llms, react

OpenAI DevDay: Let’s build developer tools, not digital God

Visit OpenAI DevDay: Let’s build developer tools, not digital God

I had a fun time live blogging OpenAI DevDay yesterday—I’ve now shared notes about the live blogging system I threw other in a hurry on the day (with assistance from Claude and GPT-4o). Now that the smoke has settled a little, here are my impressions from the event.

[... 2,090 words]

SQL Injection Isn’t Dead: Smuggling Queries at the Protocol Level (via) PDF slides from a presentation by Paul Gerste at DEF CON 32. It turns out some databases have vulnerabilities in their binary protocols that can be exploited by carefully crafted SQL queries.

Paul demonstrates an attack against PostgreSQL (which works in some but not all of the PostgreSQL client libraries) which uses a message size overflow, by embedding a string longer than 4GB (2**32 bytes) which overflows the maximum length of a string in the underlying protocol and writes data to the subsequent value. He then shows a similar attack against MongoDB.

The current way to protect against these attacks is to ensure a size limit on incoming requests. This can be more difficult than you may expect - Paul points out that alternative paths such as WebSockets might bypass limits that are in place for regular HTTP requests, plus some servers may apply limits before decompression, allowing an attacker to send a compressed payload that is larger than the configured limit.

How Web Apps Handle Large Payloads. Potential bypasses: - Unprotected endpoints - Compression - WebSockets (highlighted) - Alternate body types - Incrementation.  Next to WebSockets:  - Compression support - Large message size - Many filters don't apply

# 12th August 2024, 3:36 pm / postgresql, sql-injection, security, mongodb, websockets, http

2023

See this page fetch itself, byte by byte, over TLS (via) George MacKerron built a TLS 1.3 library in TypeScript and used it to construct this amazing educational demo, which performs a full HTTPS request for its own source code over a WebSocket and displays an annotated byte-by-byte representation of the entire exchange. This is the most useful illustration of how HTTPS actually works that I’ve ever seen.

# 10th May 2023, 1:58 pm / tls, http, encryption, explorables, websockets, https

Quicker serverless Postgres connections. Neon provide “serverless PostgreSQL”—autoscaling, managed PostgreSQL optimized for use with serverless hosting environments. A neat capability they provide is the ability to connect to a PostgreSQL server via WebSockets, which means their database can be used from environments such as Cloudflare workers which don’t have the ability to use a standard TCP database connection. This article describes some clever tricks they used to make establishing new connections via WebSockets more efficient, using the least possible number of network round-trips.

# 28th March 2023, 10:09 pm / postgresql, websockets

2019

How Zoom’s web client avoids using WebRTC (via) It turns out video conferencing app Zoom uses their own WebAssembly compiled video and audio codecs and transmits H264 over WebSockets.

# 18th April 2019, 6:20 pm / websockets, webassembly

websocketd (via) Delightfully clever piece of design: “It’s like CGI, twenty years later, for WebSockets”. Simply run “websocketd --port=8080 my-program” and it will start up a WebSocket server on port 8080 and fire up a new process running your script every time it sees a new WebSocket connection. Standard in and standard out are automatically hooked up to the socket connection. Since it spawns a new process per connection this won’t work well with thousands of connections but for smaller scale projects it’s an excellent addition to the toolbok—and since it’s written in Go there are pre-compiled binaries available for almost everything.

# 26th January 2019, 2:38 am / websockets, go

2018

Being fast and light: Using binary data to optimise libraries on the client and the server. (via) Ada Rose Cannon provides a detailed introduction to ArrayBuffers in JavaScript and describes how she used them for a custom binary protocol to sync the state of 170 Virtual Reality users in the same venue without bringing down the network.

# 13th March 2018, 2:34 pm / javascript, websockets

Channels 2.0. Andrew just shipped Channels 2.0—a major rewrite and redesign of the Channels project he started back in 2014. Channels brings async to Django, providing a logical, standardized way of supporting things like WebSockets and asynchronous execution on top of a Django application. Previously it required you to run a separate Twisted server and redis/RabbitMQ queue, but thanks to Python 3 async everything can now be deployed as a single process. And the new ASGI spec means its turtles all the way down! Everything from URL routing to view functions to middleware can be composed together using the same ASGI interface.

# 2nd February 2018, 6:19 pm / async, python3, django, python, andrew-godwin, websockets

Domains Search for Web: Instant, Serverless & Global (via) The team at Zeit are pioneering a whole bunch of fascinating web engineering architectural patterns. Their new domain name autocomplete search uses Next.js and server-side rendering on first load, then switches to client-side rendering from then on. It can then load results asynchronously over a custom WebSocket protocol as the microservices on the backend finish resolving domain availability from the various different TLD providers.

# 26th January 2018, 1:14 am / zeit-now, microservices, websockets

2017

Live htop. Neat, simplest-thing-that-could-possibly-work implementation of a tool that continually pipes the output of the htop command to a browser over a WebSocket. The htopgen.sh scripts loops every 2 seconds, runs htop, pipes it through a utility to convert the output to HTML and writes that to a file. Then the server.js Node.js script watches for changes to that file and pipes the entire file contents to the browser via socket.io. The index.html page in the browser subscribes to the WebSocket and updates the entire page using innerHTML every time it receives an event.

# 1st November 2017, 6:07 pm / websockets, nodejs

2016

What’s the cheapest or free stack solution to deploy and experiment with a realtime application in 2016?

Heroku have a good free tier, and comprehensive support for deploying both Python and Node.js. If you are mainly interested in realtime I would suggest starting out with Node.js on Heroku. Depending on the complexity of your project you might even be able to use raw Node.js without adding something like Express.

[... 81 words]

2012

What are the differences between node.js and websockets?

This is like asking “what’s the difference between PHP and HTTP”. Node.js is a technology framework you write code in. WebSockets is a protocol which can be implemented using a technology framework. You can use Node.js to implement the server-side aspect of WebSockets.

[... 57 words]

2010

Mongrel2 is “Self-Hosting”. Zed Shaw’s Mongrel2 is shaping up to be a really interesting project. “A web server simply written in C that loves all languages equally”, the two most interesting new ideas are the ability to handle HTTP, Flash Sockets and WebSockets all on the same port (thanks to an extension to the Mongrel HTTP parser that can identify all three protocols) and the ability to hook Mongrel2 up to the backend servers using either TCP/IP or ZeroMQ. I’m guessing this means Mongrel2 could hold an HTTP request open, fire off some messages and wait for various backends to send messages back to construct the response, making async processing just as easy as a regular blocking request/response cycle.

# 17th June 2010, 8:11 pm / async, c, http, mongrel2, webserver, zed-shaw, zeromq, recovered, websockets

Realtime Election Tweets. Jay Caines-Gooby’s realtime election tweet service, using Node.js, nginx and WebSocket with a Flash fallback.

# 6th May 2010, 9:20 pm / comet, election, flash, javascript, jay-caines-gooby, node, realtime, twitter, recovered, websockets

2009

Web Sockets in Tornado. Bret Taylor has a simple class making it trivial to experiment with the Web Sockets protocol (now shipping in Chrome) using the scalable Tornado application server. He also raises the million dollar question: what will existing load balancers and proxies make of the new protocol?

# 31st December 2009, 11:54 am / brettaylor, tornado, python, websockets, comet, chrome

Real time online activity monitor example with node.js and WebSocket. A neat exploration of Node.js—first hooking a “tail -f” process up to an HTTP push stream, then combining that with HTML 5 WebSockets to achieve reliable streaming.

# 8th December 2009, 11:07 pm / node, javascript, comet, html5, websockets, http

2008

Independence Day: HTML5 WebSocket Liberates Comet From Hacks. The HTML5 spec now includes WebSocket, a TCP-style persistent socket mechanism between client and server using an HTTP handshake to work around firewalls. The Orbited comet implementation provides a WebSocket compatible API to existing browsers today, and can also act as a firewall/proxy between WebSocket and regular TCP sockets, allowing browsers to talk to things like XMPP servers using Orbited to bridge the gap.

# 4th July 2008, 9:54 am / orbited, comet, html5, sockets, tcpsocket, xmpp, websockets