Skip to content

Latest commit

 

History

History
277 lines (270 loc) · 10.9 KB

slides.mdx

File metadata and controls

277 lines (270 loc) · 10.9 KB

import { Head } from 'mdx-deck' import CodeSurfer from './code-surfer' export { dark as theme } from 'mdx-deck/themes'

<title>Async Iterators - a new future for streams</title>

Async Iterators

a new future for streams


<CodeSurfer title="About Me" code={require('!raw-loader!./intro/about.js').default} steps={[ { notes: "" }, { lines: [2], notes: "Hi, I'm Stephen" }, { lines: [3], notes: "I'm on Github" }, { lines: [4], notes: "And Twitter" } ]} />

- Node.js core contributor
- Started diagnostics WG
- Elastic is always hiring!

import { Appear } from 'mdx-deck'

Agenda

  • Intro to async iterators
  • Use cases
  • Pitfalls
  • Performance

What are async iterators?


<CodeSurfer title="The basics" code={require('!raw-loader!./intro/interval.js').default} steps={[ { range: [7, 12], notes: "This is an async generator function" }, { tokens: { 7: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10] }, notes: "It has a special signature" }, { lines: [9], notes: "It can await async things" }, { lines: [10], notes: "And it can yield values" }, { tokens: { 14: [10, 11, 12, 13] }, notes: "It returns an async iterator" }, { range: [14, 16], notes: "Which can be used in a for await loop" }, { notes: "" }, ]} />

<CodeSurfer title="Custom async iterators" code={require('!raw-loader!./intro/custom-async-iterator.js').default} steps={[ { notes: "Async iterators and generators are just an interface" }, { lines: [2, 3, 9, 10], notes: "An iterator has a promise-returning next function" }, { range: [5, 8], notes: "It should resolve to an iterator item object" }, { range: [12, 16], notes: "A generator has one function returning an iterator" }, { tokens: { 13: [0,1,2,3,4,5,6,7,8] }, notes: "It requires a special symbol property name" }, ]} />

Let's see some examples


<CodeSurfer title="Node.js streams" code={require('!raw-loader!./examples/streams.js').default} steps={[ { notes: "Did you know streams are async iterators?" }, { lines: [10], notes: "If you have a stream..." }, { range: [3, 5], notes: "...you can consume it with a loop" }, { range: [1, 7], notes: "Concatenating a stream has never been easier!" }, ]} />

<CodeSurfer title="Stream transforms" code={require('!raw-loader!./examples/stream-transforms.js').default} steps={[ { notes: "" }, { lines: [1, 10, 16], notes: "The async-iterator-pipe module is super helpful" }, { lines: [11], notes: "It takes a source stream..." }, { lines: [12, 13, 14], notes: "...any number of transforms..." }, { lines: [15], notes: "...and a target" }, { lines: [11], notes: "The source must be an async iterable" }, { range: [12, 15], notes: "The rest, streams or async generator functions" }, { range: [4, 8], notes: "Functions transform an async iterable to a result" } ]} />

<CodeSurfer title="Now something more complex" code={require('!raw-loader!./examples/stream-line-split.js').default} steps={[ { range: [50, 56], notes: "Let's try converting a csv to new-line delimited JSON" }, { lines: [51], notes: "First, get a read stream of the CSV file" }, { range: [3, 21], notes: "Then we split the stream by line breaks" }, { lines: [4, 6, 7, 16], notes: "Append chunks to a buffer" }, { range: [9, 15], notes: "While the buffer contains a line break, slice and yield" }, { range: [18, 20], notes: "Remember to emit any remaining data" }, { range: [23, 41], notes: "Now that we've split to lines, parse csv lines" }, { range: [24, 32], lines: [40], notes: "The first line contains the column names" }, { range: [34, 40], lines: [26, 27], notes: "Subsequent lines contain data" }, { range: [43, 47], notes: "Lastly, stringify the JSON..." }, { lines: [55], notes: "...and pass it to the target stream" }, ]} />

<CodeSurfer title="async-iterator-pipe" code={require('!raw-loader!./examples/async-iterator-pipe.js').default} steps={[ { notes: "What is this pipe magic?" }, { range: [1, 6], notes: "Here we pipe an iterable into a stream..." }, { range: [2, 4], notes: "...by writing each item to the stream..." }, { lines: [5], notes: "...then ending the stream" }, { range: [14, 16], notes: "Iterator function targets can be called directly" }, { range: [9, 18], notes: "Some type detection allows supporting both types" }, { range: [21, 23], notes: "Multiple stages can be connected with a reduce" }, ]} />

<CodeSurfer title="HTTP servers" code={require('!raw-loader!./examples/http-server.js').default} steps={[ { notes: "HTTP requests can be async iterators too" }, { lines: [2], notes: "There's a module for that" }, { range: [4, 5], notes: "Create an HTTP server..." }, { range: [7, 10], notes: "...and convert 'request' events into iterable values" }, ]} />

<CodeSurfer title="How does http-iterator work?" code={require('!raw-loader!./examples/http-iterator.js').default} steps={[ { notes: "http-iterator is fairly simple too" }, { lines: [1], notes: "channel-surfer provides go-like channels" }, { lines: [4], notes: "Construct a channel" }, { lines: [6, 7, 8], notes: "Give it requests" }, { lines: [10, 11, 12], notes: "Close it when the server closes" }, { lines: [14], notes: "Channels are already async iterators, just return" }, ]} />

<CodeSurfer title="Pagination" code={require('!raw-loader!./examples/pagination.js').default} steps={[ { range: [1, 14], notes: "Say you have some paginated data..." }, { range: [16, 19], notes: "...and an async function to retrieve pages" }, { range: [21, 28], notes: "A naive page iterator might look like this" }, { range: [25, 27], notes: "It yields pages until loadPage returns falsy" }, { range: [2, 6], tokens: { 7: [1] }, notes: "Remember, a page is an object with an items list" }, { range: [30, 36], notes: "This flattens pages to an item iterator" }, { range: [38, 43], notes: "Let's use it now..." }, { lines: [38], notes: "Make a page iterator..." }, { lines: [39], notes: "...wrap it in a page-to-items transformer..." }, { range: [41, 43], notes: "...then consume the result" }, ]} />

<CodeSurfer title="Batching" code={require('!raw-loader!./examples/batching.js').default} steps={[ { range: [5, 11], notes: "Consider an infinite sequence of async things..." }, { range: [13, 18], notes: "To create a batch, we need a function to get a slice" }, { range: [14, 17], notes: "For of loops consume the iterator, so use while instead" }, { range: [20, 26], notes: "We'll also need to convert the async iterator to an array" }, { range: [28, 35], notes: "Now we can split to batches as an async iterator" }, { lines: [30], notes: "Get an iterator page of the given size" }, { lines: [31], notes: "Convert the iterator to an array" }, { range: [32, 33], notes: "As long as there are items, yield the slice" }, { range: [37, 44], notes: "...and process the batches however we want!" }, { lines: [41], notes: "Here we get numbers in batches of eight." }, { tokens: { 42: [10, 11, 12, 13, 14, 15, 16] }, notes: "Then we take eight pages of batches." }, { lines: [43], notes: "Then we sum each batch" }, ]} />

There are some pitfalls


<CodeSurfer title="For-await fully consumes" code={require('!raw-loader!./examples/batching.js').default} steps={[ { range: [14, 17], notes: "Use while and iterator interface for partial consumption" } ]} />

<CodeSurfer title="Microtasks are higher priority" code={require('!raw-loader!./examples/microtask-priority.js').default} steps={[ { range: [7, 9], lines: [2], notes: "Consider an infinite loop..." }, { range: [2, 5], notes: "...with a timeout escape" }, { range: [1, 10], notes: "This might seem reasonable at first, but it's not." }, { range: [12, 13], notes: "Let's produce numbers until the timeout is reached." }, { range: [15, 19], notes: "Then sum all the numbers produced until the timeout." }, { range: [15, 19], notes: "Did you catch the issue?" }, { tokens: { 13: [5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16] }, notes: "The immediately available result will create a microtask loop" } ]} />

<CodeSurfer title="Remember dezalgo?" code={require('!raw-loader!./examples/dezalgo.js').default} steps={[ { notes: "Ḥ̶̷͉̋̎ͩ̔̚ͅe̵̮ͫ̀͞ͅ ̂ͮ̌ͩ̉͝͏̙̱̥r̡̙͓̙̲̲̯̞̼̓͐̏̄͆̀ͮ͊̿ệ̸͖̭̤͕͚̬̚ͅt̶ͤ͋͛̑͜͏̠̪͔̲͚̙̦u̡̥̖̍ͮͧͪ̄͡r̵̤̰̥̭̦̲͐̓̐̆͑ň̝̜̭̔̉ͧ̎̎͋̒͘͟͠s̶̬̜̤̳̻͛̑̅ͨ̉ͦ͒ͨ̀̚ͅ,̰̠͇̩͙̋̅̋͊̈́̀ ̲̖̗̉ͯ̎̔̑̆͟w͐͌͆̿ͮͯ҉̢͔̲̻̱͕̻͞ͅi̸̞͖̙͔̲͉̼ͬṯ̼̌̐̀ͩ̈́̐́̓͞h̗̩̦͓̽̌̈̈̋͆̀͗ ̍̔̔̅̐ͥ̓̽̾͏҉̻̲̣̲̮̩ͅc͖̞̰̪͍̍̿͋̈̀ͣ͊ͬͭ͠h̷̜̗͍̺̥ͪ͊ͣͩ̚a̺ͩͤ͗ͨ̂̕͜o̟̖͚̟̗͉̠̗̿͂ͥ͌͑ͤ͂̒ͦ̀͢s̶̮̬̲̲̦̳ͣ̈́ͧ͒̕͞ͅ ̡̪̘̖͐͒ͤͨ̌̈́ͭͣ̐i͆́͆ͫ̄͑ͬ͆̉͏̝̤̹͍͖̩̩̺͜n͔̣̠̳̱͚̘͒̽͑̎ ͙̖̦̣̄̽́̽̎̍̊ͧ͜h͇̻̙̬ͯ̽ͣ̓ͣ́ͤͭ̈́i̞͖̰̿̔̎ͧ́s̨̯̫͍̝̰͉̬ͥ̍̊͐́̍̎̿ ̸̲̗͍̗͉̪͉̙͛̿̋͑̓͒ͥ̚̚͜w̭͖̻̺̬̳͕̗̄̽̃͌â̵̲͔ͭͪ͛͊̿͝ͅk̗̤̣͐̒̿̌͐̐̿̚͘e̸̳͎͇̺ͪ̅̓̿.͂̓̌̎̋͝҉̲͢" }, ]} />

What about performance?

I've heard promises are slow


<CodeSurfer title="First, let's look at streams" code={require('!raw-loader!./performance/upper/stream.js').default} steps={[ { range: [3, 8], notes: "We'll just do a basic uppercasing stream" }, ]} />

<CodeSurfer title="Next, the async iterator" code={require('!raw-loader!./performance/upper/async-iterator.js').default} steps={[ { range: [3, 8], notes: "The async iterator is nearly identical" } ]} />

Let's try something less naive

Converting CSV to JSON


<CodeSurfer title="First, let's look at streams" code={require('!raw-loader!./performance/csv-to-json/stream.js').default} steps={[ { range: [4, 31], notes: "We need a CSV transform" }, { range: [7, 8], notes: "First, get all the unprocessed data" }, { lines: [10, 11, 21, 22, 23], notes: "Then loop while the buffer contains a line break" }, { range: [12, 19], notes: "Split each line and store headers or push an object" }, ]} />

<CodeSurfer title="Next, the async iterator" code={require('!raw-loader!./performance/csv-to-json/async-iterator.js').default} steps={[ { range: [4, 26], notes: "We need a CSV transform" }, { lines: [5, 9], notes: "We need to handle unprocessed data for each chunk" }, { lines: [11, 12, 23, 24, 25], notes: "The line loop should look familiar..." }, { range: [13, 21], notes: "..and even the splitting and yielding is similar" }, ]} />

Go forth

and async iterate!

Slides are at {document.location.origin}