JavaScript Iterables and Iterated

In this post, I explain why Iterables and AsyncIterables in JavaScript are great, and why the library I released, iterated, is a great tool for unleashing their potential. It is time to move beyond Arrays alone and embrace Iterables.

Simply put, an Iterable is any value that we can loop over. This includes String, Array, Set, and Map. Since data usually lives in arrays, we already use iterables "under the hood" even if we don't know it. For example, when using for..of:

const values = [1, 2, 3]
for (const value of values) {
  console.log(value)
}
Looping with for..of works with any iterable.

JavaScript introduced iterables a while ago, and they have been widely supported since 2017. Other languages like Python have had them since 2001, where they are a core part of the language. However, in my experience, developers tend to focus on working only with arrays (e.g., Array.map, Array.filter) instead of shifting their mindset and working with iterables, even though iterables have several benefits:

  1. Abstraction over the collection that holds our data.
  2. Avoiding intermediate data structures, reducing CPU and RAM usage, and allowing infinite and streaming collections.
  3. AsyncIterables, which bring promises into iteration for I/O and API calls.
  4. A great fit with pipes.

In this post, I plan to entice you to move from an Array-centric vision to an Iterable-centric one.

Abstraction

The first benefit is abstraction. As we have several types of collections, like Array, String, or Map, iterables allow us to write agnostic functions that work with any of them. Put another way: functions that only need something that loops:

function printValues(iterable: Iterable<any>) {
  for (const value of iterable) {
    console.log(value)
  }
}

printValues([1, 2, 3])
printValues('foo bar')
printValues(new Set([1, 2, 3]))
This function accepts any kind of iterable, including Array, String, and Set.

Avoiding intermediate data structures

Transforming arrays of data often means creating intermediate data structures, which costs RAM and CPU.

In the following example, Array.map and Array.filter each create a new array, so we end up with three arrays: input, x, and y.

const input = [1, 2, 3]
const x: Array<number> = input.map(n => n * 2)
const y: Array<number> = x.filter(n => n < 4)
Example of applying map and filter the regular way, which copies the data at each step.

Iterables do not generate an intermediate data structure:

import it from 'iterated'

const input = [1, 2, 3]
const x: Iterable<number> = it.map(input, n => n * 2)
const y: Iterable<number> = it.filter(x, n => n < 4)
Example of applying map and filter with iterables, which do not generate array copies.

This is because iterables are lazy: they do not iterate until necessary, only on demand, for example when a function forces them to generate an array.

const result: Array<number> = it.array(y)
Transforming an iterable into an array, creating a single copy.

In sum, with iterables we avoid intermediate data structures, which is crucial when working with big datasets.
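To see this laziness without any library, here is a minimal, self-contained sketch of the iterator protocol (lazyDouble is a hypothetical name, not part of iterated): the doubling work only happens when a consumer asks for the next value.

```typescript
// Hand-written iterable: no work happens until next() is called.
function lazyDouble(input: Array<number>): Iterable<number> {
  return {
    [Symbol.iterator]() {
      let i = 0
      return {
        next(): IteratorResult<number> {
          if (i >= input.length) return { done: true, value: undefined }
          // The doubling happens here, on demand, one value at a time
          return { done: false, value: input[i++] * 2 }
        },
      }
    },
  }
}

const doubled = lazyDouble([1, 2, 3]) // no iteration has happened yet
const result = [...doubled] // the spread forces iteration, creating a single array
console.log(result) // [2, 4, 6]
```

In practice you rarely write the protocol by hand (generators, introduced next, do it for you), but it shows there is no magic: an iterable is just an object that hands out values one at a time.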

Generators and yielding

In the same way that arrays and strings are iterables, functions can be iterables too. Rephrasing it: we can iterate functions. To tell JavaScript that a function can be iterated, we use two things: the yield keyword and an asterisk after the function keyword:

function* iterableFunction() {
  yield 1
  yield 2
  yield 3
}

for (const val of iterableFunction()) {
  console.log(val) // prints 1 2 3
}
Example of creating and executing a generator function.

For a function accepting an Iterable (like the printValues we defined earlier), this is still a regular Iterable:

printValues(iterableFunction()) // prints 1 2 3
printValues([1, 2, 3]) // prints 1 2 3
printValues accepts any iterable, including iterableFunction().

We call these iterable functions generators.

A function that yields avoids creating intermediate data structures. We can demonstrate this by comparing the implementations of a regular Array.map and a yielding map function:

function regularOldMap<T, U>(input: Array<T>, func: (val: T) => U): Array<U> {
  const newArray: Array<U> = []
  for (const val of input) {
    const result = func(val)
    newArray.push(result)
  }

  // By now we have two arrays full of data: input and newArray
  return newArray
}

function* yieldingMap<T, U>(input: Iterable<T>, func: (val: T) => U): Iterable<U> {
  for (const val of input) {
    const result = func(val)
    yield result
  }
}
Implementations of a map function with and without iterables.

Even though both functions achieve the same result:

  • The Iterable one accepts any type of iterable (i.e., Array, String, Set, function*).
  • The Iterable one avoids creating an intermediate data structure (i.e., newArray).

And thus yieldingMap is superior to regularOldMap.

iterated provides functions such as yieldingMap (i.e., map) to handle iterables.

Finally, thanks to their lazy nature, generator functions enable collections of infinite values, such as the Fibonacci sequence, as well as streaming information.
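As an illustration of an infinite collection (a sketch, not part of iterated): a generator that yields the Fibonacci sequence forever, while the consumer decides when to stop.

```typescript
// Infinite generator: values are produced one at a time, on demand.
function* fibonacci(): Generator<number> {
  let [a, b] = [0, 1]
  while (true) {
    yield a
    ;[a, b] = [b, a + b]
  }
}

// Take only the first five values; the sequence is never materialized in full.
const firstFive: Array<number> = []
for (const n of fibonacci()) {
  if (firstFive.length === 5) break
  firstFive.push(n)
}
console.log(firstFive) // [0, 1, 1, 2, 3]
```

With an array this would be impossible: there is no such thing as an array of every Fibonacci number.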

AsyncIterables

Converting the yieldingMap example into an async function (e.g., for fetching data) turns the return type from Iterable into AsyncIterable:

async function* fetchUsers(userIds: Iterable<UserId>): AsyncIterable<User> {
  for (const userId of userIds) {
    const user = await fetch(`${baseUrl}/${userId}`)
    // do something
    yield user
  }
}
An async generator function that fetches users.

This means that every iteration over an AsyncIterable has to be awaited, using for await...of:

async function doSomething() {
  const usersId = [1, 2, 3]
  for await (const user of fetchUsers(usersId)) {
    // do something
  }
}
Looping over the users from the previous example.

The beauty of this approach is that we only loop after fetching a user (i.e., lazily).
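To make this concrete without a real network, here is a self-contained sketch of the same pattern, where a fake fetch resolves each value asynchronously (fetchNumbers and the `* 10` transformation are hypothetical stand-ins):

```typescript
// An async generator: each value is awaited lazily, one loop step at a time.
async function* fetchNumbers(ids: Iterable<number>): AsyncIterable<number> {
  for (const id of ids) {
    // Stand-in for a real fetch call
    const value = await Promise.resolve(id * 10)
    yield value
  }
}

async function main() {
  const results: Array<number> = []
  for await (const value of fetchNumbers([1, 2, 3])) {
    results.push(value)
  }
  console.log(results) // [10, 20, 30]
}

main()
```

Nothing is fetched when fetchNumbers([1, 2, 3]) is called; each await only happens when the for await loop pulls the next value.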

We can achieve the same without AsyncIterable, but at the cost of extra boilerplate:

function fetchUsers(userIds: Array<UserId>): Array<Promise<User>> {
  const users: Array<Promise<User>> = []
  for (const userId of userIds) {
    const user = fetch(`${baseUrl}/${userId}`)
    // do something
    users.push(user)
  }
  return users
}


async function doSomething() {
  const usersId = [1, 2, 3]
  for (const userPromise of fetchUsers(usersId)) {
    const user = await userPromise
    // do something
  }
}
Same functionality as the previous example, without using AsyncIterable.

As JavaScript’s async/await is everywhere, having a simple way to handle async calls helps reduce code and complexity.

AsyncIterables are especially useful when, at the moment of starting the loop, we do not know when it will finish, for example when accessing I/O.

An issue is that the same function cannot handle both Iterable and AsyncIterable (because an AsyncIterable must be consumed inside an async function with a for await...of loop), so we partially lose the value of having a function that works with any kind of collection of data.

iterated fixes this by auto-handling Iterable and AsyncIterable transparently:

const users: AsyncIterable<User> = fetchUsers([1, 2, 3])
const successfulUsers: AsyncIterable<User> = it.filter(users, ({ success }) => success)
iterated's functions (like filter) accept both Iterables and AsyncIterables.

Pipes

When processing data, pipes improve legibility by focusing on the operations to apply to the data, instead of on constructs like for...of.

Pipes are a natural fit for iterables because both focus on the data and the operations we want to apply to it. The following example is what I use to load the posts of this blog:

import it from 'iterated'

const posts: AsyncIterable<Post> = it.pipe(
  postIds(),
  it.map(fetchPost), // This returns a Promise<Post>
  it.await, // transform an Iterable<Promise<Post>> to an AsyncIterable<Post>
  it.filter(post => !post.id.startsWith('_')), // filter private posts
)
A pipe that fetches posts from disk and filters out private ones.

Executing the previous example still does not trigger loading any posts; that only happens when we call an operation that returns an actual data structure.

const response = {
  posts: await it.array(posts)
}
Obtaining an array from the result of the pipe.

Note how the function fetchPost only fetches a single post, ignoring whether it is used in a collection, thanks to map.
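To show why pipes and iterables compose so naturally, here is a minimal sketch of a pipe with curried iterable operators. This is not iterated's actual implementation, just an illustration of the idea:

```typescript
// Curried operators: each one takes an iterable and returns a lazy iterable.
const map = <A, B>(fn: (a: A) => B) =>
  function* (input: Iterable<A>): Iterable<B> {
    for (const value of input) yield fn(value)
  }

const filter = <A>(pred: (a: A) => boolean) =>
  function* (input: Iterable<A>): Iterable<A> {
    for (const value of input) if (pred(value)) yield value
  }

// pipe threads the input through each step; nothing runs until consumed.
const pipe = (input: any, ...fns: Array<(arg: any) => any>) =>
  fns.reduce((acc, fn) => fn(acc), input)

const piped = [
  ...pipe(
    [1, 2, 3, 4],
    map((x: number) => x * 2),
    filter((x: number) => x < 7),
  ),
]
console.log(piped) // [2, 4, 6]
```

Because each operator is just a function from iterable to iterable, any custom function with that shape slots into the pipe with no extra ceremony.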

A usual problem with pipes is following the shape of the data, especially after applying several transformations. That is why I ensured that iterated brings TypeScript support, inferring the data type at each step. For example, the following causes TypeScript to complain:

const result = it.pipe(
  'foobar',
  it.filter(x => x < 3) // Error: operator < can't be applied to a string and number
)
An example of iterated's inference: filter's 'x' is inferred as string, and result as Iterable<string>.

About the built-in methods

When handling Arrays, JavaScript’s built-ins use chaining instead of piping:

const withBuiltIn = [1, 2, 3].map().reduce()
const withIterated = it.pipe([1, 2, 3], it.map(), it.reduce())
Comparison between applying map and reduce using the built-in and iterated.

And the JavaScript team is repeating this pattern with the iterator helpers proposal for built-in iterable methods:

const iterator = [1, 2, 3].values()
const withBuiltIn = iterator.map().reduce()

const withIterated = it.pipe([1, 2, 3], it.map(), it.reduce())
Comparison between applying map/reduce with the built-in methods and iterated.

Although both approaches work, pipes have an advantage over chains: they are functional. Thanks to this, we can easily support AsyncIterable and, most importantly, they are extensible.

By extensible, I mean that adding your own functions doesn't feel alien or quirky:

function doStuffWithDevices(devices: Iterable<Device>): Iterable<Device> {
  // Loop over the devices and do something with them
}

const withIterated = it.pipe([{ id: 'device-1' }], doStuffWithDevices)

// ???
Iterator.prototype.doStuffWithDevices = doStuffWithDevices
const withBuiltIn = [{ id: 'device-1' }].values().doStuffWithDevices()
Comparison of adding a custom function to an iterated pipe and a built-in chain.

And finally, a word of warning: the built-ins don't work with all iterables, while iterated does.

Conclusions

Iterables and AsyncIterables are an improvement when working with JavaScript: they abstract away the collection (e.g., Array, Set, Map, function*) and let us focus on our data, they iterate only once, and they reduce RAM usage and increase performance. The cost is shifting our mental model and our functions from arrays to iterables, and embracing pipes.

Pipes are a good way to transform data with iterables as they focus on the transformations we want to do to our data. Although there is a JavaScript proposal for iterable chaining, I believe pipes are more agnostic and extensible, especially when moving between Iterable and AsyncIterable, and supporting any kind of iterable.

Moreover, although there are many good libraries for working with iterables (e.g., iterate-iterator, iterare, the new built-in methods), only iterated brings all of the following:

  • Pipes.
  • Great typing support.
  • Iterable and AsyncIterable transparent support.
  • Simple extensibility.
  • Simple interface.

As its biggest drawback, iterated is new and would benefit from more and improved functions.


What do you think? Am I convincing you about iterables (and iterated)? Are there any goodies, drawbacks, or packages I missed? I would love to read your comments on Mastodon.