Multi-Module Projects

This page describes JavaScript APIs and overall architecture of multi-module projects in Squiggle

Terminology

Project: a class providing the public APIs for running multi-module Squiggle projects; represented by SqProject class
Project state: current state of the project, which includes all modules and their computed outputs; represented by ProjectState class
Module: a single Squiggle file with metadata (name and import pins); represented by SqModule class
Module output: computed values of a module in a specific environment; represented by SqModuleOutput class
Linker: pluggable object that describes how imports are resolved; represented by SqLinker type
Head: named entrypoint to the module graph; all modules that are not addressable by some head will be cleaned up from the state eventually
Resolution: hash-based pointer from the module name to the specific module object
Pin: hash for an import that guarantees immutability

Architecture

SqProject uses the Functional Core, Imperative Shell approach inspired by CodeMirror and Git: its state is immutable and updated by dispatching actions.

When something changes in SqProject state, it will emit events; you can listen to events and react to them in your own code.

Minimal example project

const project = new SqProject();
 
project.setHead('main', {
    module: new SqModule({
        name: 'my-module',
        code: 'x = 2 + 2',
    })
});

Here's how the project state will look when you run the code above:

The following examples use a custom wrapper to <ProjectStateViewer> that allows you to control the loading and running manually.

Every time the project attempts to load or run a module, the operation will be blocked, and you'll see a new button in the top row that can resolve the pending promise.

In a real-world scenario, the loading and running will happen automatically; "Automatic" tab implements that behavior.

0 heads, 0 modules, 0 outputs

Action log

React Flow

In the diagram above, Head is the entrypoint: it points to the main module that the user is interested in.

Next comes the module, my-module. If you hover over the module node, you'll see its source code.

The hash string in the module node is its unique identifier. All modules and outputs are content-addressable; for modules, the hash is a SHA-256 sum of its code, name and import pins.

Module leads to its output. Output is simulated in a specific environment (sampleCount, xyPointLength and a seed).

Note how the code above only added the head, but the output was computed. The process that happens from setting the head to producing the output can be quite complicated:

first, the module is parsed
then its imports are loaded as additional modules
then the imported modules are parsed and their imports are loaded and so on
as soon as there's a module that's (1) fully loaded, (2) all of its imports are computed, it will be computed ("simulated")
this will trigger the simulation of its parents
eventually, the output for the topmost, head module will be produced

This chain of events is internally represented with actions. When you add a head, the first action will be dispatched internally, which might lead to other actions when something is loaded or a simulation was completed.

You can expand the "Action log" in the diagram above to see the actions that were dispatched in this example.

APIs for reacting to project changes

To be notified about SqProject state changes, you can use one of two approaches:

addEventListener

project.addEventListener('state', (event) => {
    console.log(event.data);
});

state event will fire on any state changes
action event will fire when SqProject dispatches any internal action
output event will fire when SqProject produces a new ouptut

Don't forget to clean up a listener; e.g., in a React hook:

useEffect(() => {
    const listener: ProjectEventListener<"action"> = (event) => {
        console.log(event.data);
    };
    project.addEventListener("action", listener);
    return () => project.removeEventListener("action", listener);
});

waitForOutput helper

Event listener mechanism is flexible but verbose, so we also provide this shortcut:

const output = await project.waitForOutput(headName);

Multi-module example

0 heads, 0 modules, 0 outputs

Action log

React Flow

Linkers

SqLinker type is a pluggable object that describes how imports are resolved.

It allows you to customize the way Squiggle modules are resolved and loaded, by defining two hooks: linker.resolve and linker.loadModule.

As an example, import statement can look like this: import "import-name" as importVar.

In this example, "import-name" is a string that will first be resolved by the linker.resolve method to the real module name.

Typically, the normalized name is the same as the name in the import, so linker.resolve will be defined as (name) => name.

In other environments, e.g. in the CLI that relies on the local filesystem, import names can be reslative: import "./foo" as foo, and the linker.resolve function will normalize that to the absolute path.

After the name is resolved, the module is loaded with linker.loadModule method.

For example, on Squiggle Hub, we use a custom linker that fetches hub:* modules from the server through GraphQL.

In the command-line squiggle script, we load modules from the local filesystem.

For now, Squiggle Playground doesn't have a linker, but in the future we plan to add a way to load modules from Squiggle Hub or other sources.

Garbage collection

When you add a head to the project, it will start computing the outputs for all modules that are reachable from the head.

When you update a head, the project will clean up all modules that are not reachable from the remaining set of heads.

Dependency resolving and pinning

This functionality is still untested. We don't use dependency pinning on Squiggle Hub yet.

linker.loadModule method takes two parameters: the module name and its optional hash that comes from pins stored on the parent module object.

If the hash is provided, the linker must return the module with that hash.

If all modules have pins, all the way down to the leaf modules, the project will be completely deterministic and reproducible. This is going to be useful for caching in the future releases of Squiggle.

Performance considerations

There are a few places in SqProject implementation where we make the full scan of the state to decide which parts we need to run next.

This shouldn't matter for small projects, but for large projects with hundreds of modules, it might become a bottleneck. This is something we plan to address in the future.

Multi-Module Projects

On this page