Multi-module projects
This page describes JavaScript APIs and overall architecture of multi-module projects in Squiggle
Terminology
- Project: a class providing the public APIs for running multi-module Squiggle projects; represented by
SqProject
class - Project state: current state of the project, which includes all modules and their computed outputs; represented by
ProjectState
class - Module: a single Squiggle file with metadata (name and import pins); represented by
SqModule
class - Module output: computed values of a module in a specific environment; represented by
SqModuleOutput
class - Linker: pluggable object that describes how imports are resolved; represented by
SqLinker
type - Head: named entrypoint to the module graph; all modules that are not addressable by some head will be cleaned up from the state eventually
- Resolution: hash-based pointer from the module name to the specific module object
- Pin: hash for an import that guarantees immutability
Architecture
SqProject
uses the Functional Core, Imperative Shell approach inspired by CodeMirror and Git: its state is immutable and updated by dispatching actions.
When something changes in SqProject
state, it will emit events; you can listen to events and react to them in your own code.
Minimal example project
Here's how the project state will look when you run the code above:
The following examples use a custom wrapper to <ProjectStateViewer>
that allows you to control the loading and running manually.
Every time the project attempts to load or run a module, the operation will be blocked, and you'll see a new button in the top row that can resolve the pending promise.
In a real-world scenario, the loading and running will happen automatically; "Automatic" tab implements that behavior.
Action log
In the diagram above, Head is the entrypoint: it points to the main module that the user is interested in.
Next comes the module, my-module
. If you hover over the module node, you'll see its source code.
The hash string in the module node is its unique identifier. All modules and outputs are content-addressable; for modules, the hash is a SHA-256 sum of its code, name and import pins.
Module leads to its output. Output is simulated in a specific environment (sampleCount
, xyPointLength
and a seed
).
Note how the code above only added the head, but the output was computed. The process that happens from setting the head to producing the output can be quite complicated:
- first, the module is parsed
- then its imports are loaded as additional modules
- then the imported modules are parsed and their imports are loaded and so on
- as soon as there's a module that's (1) fully loaded, (2) all of its imports are computed, it will be computed ("simulated")
- this will trigger the simulation of its parents
- eventually, the output for the topmost, head module will be produced
This chain of events is internally represented with actions. When you add a head, the first action will be dispatched internally, which might lead to other actions when something is loaded or a simulation was completed.
You can expand the "Action log" in the diagram above to see the actions that were dispatched in this example.
APIs for reacting to project changes
To be notified about SqProject
state changes, you can use one of two approaches:
addEventListener
state
event will fire on any state changesaction
event will fire whenSqProject
dispatches any internal actionoutput
event will fire whenSqProject
produces a new ouptut
Don't forget to clean up a listener; e.g., in a React hook:
waitForOutput
helper
Event listener mechanism is flexible but verbose, so we also provide this shortcut:
Multi-module example
Action log
Linkers
SqLinker type is a pluggable object that describes how imports are resolved.
It allows you to customize the way Squiggle modules are resolved and loaded, by defining two hooks: linker.resolve
and linker.loadModule
.
As an example, import statement can look like this: import "import-name" as importVar
.
In this example, "import-name"
is a string that will first be resolved by the linker.resolve
method to the real module name.
Typically, the normalized name is the same as the name in the import, so linker.resolve
will be defined as (name) => name
.
In other environments, e.g. in the CLI that relies on the local filesystem, import names can be reslative: import "./foo" as foo
, and the linker.resolve
function will normalize that to the absolute path.
After the name is resolved, the module is loaded with linker.loadModule
method.
For example, on Squiggle Hub, we use a custom linker that fetches hub:*
modules from the server through GraphQL.
In the command-line squiggle
script, we load modules from the local filesystem.
For now, Squiggle Playground doesn't have a linker, but in the future we plan to add a way to load modules from Squiggle Hub or other sources.
Garbage collection
When you add a head to the project, it will start computing the outputs for all modules that are reachable from the head.
When you update a head, the project will clean up all modules that are not reachable from the remaining set of heads.
Dependency resolving and pinning
This functionality is still untested. We don't use dependency pinning on Squiggle Hub yet.
linker.loadModule
method takes two parameters: the module name and its optional hash that comes from pins stored on the parent module object.
If the hash is provided, the linker must return the module with that hash.
If all modules have pins, all the way down to the leaf modules, the project will be completely deterministic and reproducible. This is going to be useful for caching in the future releases of Squiggle.
Performance considerations
There are a few places in SqProject
implementation where we make the full scan of the state to decide which parts we need to run next.
This shouldn't matter for small projects, but for large projects with hundreds of modules, it might become a bottleneck. This is something we plan to address in the future.