The kcats Programming Language (Production Implementation)
Table of Contents
- 1. Production implementation
- 2. Issues
- 2.1. INPROGRESS Interactive mode tools
- 2.2. INPROGRESS Implement pipes (stdlib)
- 2.2.1. DONE Write to a file
- 2.2.2. DONE Read from a file
- 2.2.3. DONE Close a pipe
- 2.2.4. DONE Serialize pipes with something sane
- 2.2.5. DONE Sockets
- 2.2.6. DONE Convert In/Out traits to enums in pipes modules
- 2.2.7. DONE Composable transforms
- 2.2.8. CANCELED Filled pipes
- 2.2.9. INPROGRESS Object pipes
- 2.2.10. DONE Time pipe
- 2.2.11. DONE stdin/stdout pipes
- 2.2.12. CANCELED Pipe take outcome
- 2.3. TODO Error should have actual struct fields (optimization)
- 2.4. INPROGRESS Script
- 2.5. TODO retry should have opposite argument order (stdlib, consistency)
- 2.6. INPROGRESS Support Kademlia DHT
- 2.7. DONE read and emit don't have quite the same semantics (consistency)
- 2.8. DONE Inconsistent stack handling when encountering error (consistency)
- 2.9. DONE Inconsistent expression handling when encountering error
- 2.10. TODO Performance optimizations (optimization)
- 2.11. INPROGRESS Generators (stdlib)
- 2.11.1. DONE Basic functionality and generators
- 2.11.2. DONE map
- 2.11.3. DONE filter
- 2.11.4. DONE take
- 2.11.5. DONE drop
- 2.11.6. DONE drop-while (skipper)
- 2.11.7. DONE take-while (catcher)
- 2.11.8. CANCELED last
- 2.11.9. TODO distinct
- 2.11.10. DONE partition
- 2.11.11. DONE joiner (aka catenate)
- 2.11.12. DONE groupby
- 2.11.13. CANCELED Map/filter can't access lower stack items
- 2.11.14. DONE Reduce
- 2.11.15. CANCELED Generator combinators?
- 2.11.16. DONE Applying generator to an existing container
- 2.11.17. INPROGRESS Combinations
- 2.11.18. DONE Frequencies
- 2.12. TODO Make floats hashable
- 2.13. DONE Implement sorting (stdlib)
- 2.14. TODO Stream transformation
- 2.15. INPROGRESS Select from multiple pipes
- 2.16. TODO Monitoring tools
- 2.17. INPROGRESS Native REPL
- 2.18. CANCELED Words that quote programs instead of executing them
- 2.19. TODO Data compression
- 2.20. TODO Multimethod improvements
- 2.21. CANCELED run multiple programs on same argument to get list
- 2.22. INPROGRESS pairwise operations
- 2.23. DONE Non-generator filter
- 2.24. INPROGRESS Modules
- 2.24.1. Problem statement
- 2.24.2. Discussion
- 2.24.3. Implementation
- 2.24.3.1. TODO take a dictionary and a program and execute the program with that dict
- 2.24.3.2. TODO take a mapping of name to module and return a dictionary with those modules
- 2.24.3.3. TODO One module depends on another, loads it
- 2.24.3.4. TODO Revert back to namespaces
- 2.24.3.5. TODO Fix partition module logic
- 2.24.4. INPROGRESS inscribe currently re-defines words repeatedly at runtime
- 2.24.4.1. INPROGRESS Current design
- 2.24.4.2. INPROGRESS Library loading
- 2.24.4.3. INPROGRESS Nesting scopes
- 2.24.4.4. TODO Stack escape protection
- 2.24.4.5. INPROGRESS Sandboxing support
- 2.24.4.6. INPROGRESS Access control
- 2.24.4.7. INPROGRESS Words can refer to other words in the same library
- 2.24.4.8. TODO Convenient module definition
- 2.24.4.9. TODO convenient 'let'
- 2.24.4.10. INPROGRESS Break up the standard library
- 2.24.4.11. CANCELED Disallow module alias overwriting
- 2.24.4.12. TODO Store data sources
- 2.24.4.13. TODO Find stdlib by alias
- 2.24.4.14. TODO Library loading should be in order of decreasing trust
- 2.24.5. TODO Debugger needs special handling to work with nested environments
- 2.25. INPROGRESS Database
- 2.26. TODO Reduce CPU cost of `shield` (optimization)
- 2.27. TODO Sort out feature dependencies
- 2.28. TODO Improved error messages (errorHandling)
- 2.29. INPROGRESS Generate word dependency graph
- 2.30. INPROGRESS Let doesn't inherit the current resolver
- 2.31. DONE Make templating a rust function
- 2.32. TODO Add description to each example (testing)
- 2.33. TODO Add integration tests (testing)
- 2.34. TODO Size of option enums
- 2.35. TODO Support converting Association to Set
- 2.36. INPROGRESS Debug nested envs
1. Production implementation
1.1. Base Language
Built in Rust: it's fast and modern, and its memory-allocation model seems well suited to kcats.
1.2. Status
Unstable - code written in kcats now will likely require modification to work with future versions of the interpreter.
1.3. Building
1.3.1. Dependencies
- rustc
- cargo
1.3.2. Build
Run `cargo build --release`; the binary will be placed in `./target/release` by default.
1.4. Using
1.4.1. Command line REPL
This is the easiest way to get started. Run `kcats -r` and it will
print a prompt and wait for you to input items (as many as you like,
on a single line). It will then evaluate all the items and print the
resulting stack. You can then enter more items. It keeps the stack
intact so you're not starting fresh with each input. If you want to
clear the stack, you can use `[] restore`.
Use Ctrl-C to quit.
Example session:
~/workspace/kcats $ kcats -r
kcats> 1
1
kcats> 2
2 1
kcats> +
3
kcats> [7 8 9] [*] step
1512
kcats>
1.4.2. Command line
Execute `kcats`. It will read a program from stdin and execute it,
then print the resulting stack to stdout. You can pass input to it
via stdin by:
- interactive typing (end input with Ctrl-D on most platforms): `kcats`
- piping from a file, e.g.: `kcats < myprog.kcats`
- using echo: `echo "[1 2 3] [inc] map" | kcats`
1.4.3. Emacs Interactive REPL
See emacs-ide.org in the source tree. The elisp files you need to
evaluate are there. Evaluate them, then run `M-x kcats-repl`. You may
need to run `M-x customize-variable` on `kcats-babel-executable` and
enter the location where you installed the kcats binary.
1.5. Source
1.5.1. Project File
[package]
name = "kcats"
version = "0.10.0"
edition = "2021"

# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html

[dependencies]
# serialization
edn-format = "3"
serde = "1"
serde_json = "1"
#edn-format = { path = "../edn-format" }
base64 = "0.22"

# String literals
internment = {version = "0.6.0", features = ["serde"]}
lazy_static = "1"
num-integer = "0"

# String format
# dyn-fmt = "0.4.0"
dynfmt = { version = "0", features = ["curly"] }

# crypto stuff
ed25519-dalek = {version="1", features=["batch_deterministic", "std", "rand"]}
sha2 = {version="0", features=["std"]}
rand_core = "0.5.1" # careful here, having 2 versions present will make weird compile errors
rand = "0"

# multithreading
futures = "0"
tokio = { version = "1", features = ["full"] }

# multiple-consumer channels
#crossbeam-channel = "0.5" # doesn't support async send/recv
#async-channel = "1.8.0"
flume = "0"

# debugging
# backtrace = "0.3.61"

# database
## Figure out best place to store the db and stdlib files
directories = "5"
rusqlite = { version = "0", optional = true, features = ["uuid", "bundled"] }

# memoized functions
once_cell = "1"

# The blob cache
cache = {path = "./cache"}

# Android
# android logging
libc = "0.2"
jni = "0.21"

[dependencies.uuid]
version = "1"
features = [
    "v4",       # Lets you generate random UUIDs
    "v7",
    "fast-rng", # Use a faster (but still sufficiently random) RNG
]

#chrono = "0.4.31"

[dev-dependencies]
test-case = "2"

[build-dependencies]
directories = "5"
sha2 = "0"
base64 = "0.22"
cache = {path = "./cache"}

[features]
database = ["rusqlite"]

[lib]
name = "kcats"
crate-type = ["cdylib", "rlib"]
path = "src/lib.rs"

[[bin]]
name = "kcats"
path = "src/main.rs"
1.5.2. Internal traits
Because of Rust's orphan rule (you can't implement a trait on a type unless you own either the trait or the type), we'll opt for making our own traits rather than using the "newtype" pattern of making our own types to wrap stdlib types.
use crate::types::container::error::Error;

// Define custom traits that mimic std ones

/// a trait similar to [std::convert::From]
pub trait Derive<T>: Sized {
    fn derive(value: T) -> Self;
}

/// a trait similar to [std::convert::TryFrom]
pub trait TryDerive<T>: Sized {
    type Error;
    fn try_derive(value: T) -> Result<Self, Self::Error>;
}

/// a trait similar to [std::convert::Into]
pub trait Fit<T>: Sized {
    fn fit(self) -> T;
}

/// a trait similar to [std::convert::TryInto]
pub trait TryFit<T>: Sized {
    type Error;
    fn try_fit(self) -> Result<T, Self::Error>;
}

/// a trait that marks iterable types that can return arbitrary
/// numbers of items. For example lists, maps, etc. But not things
/// like Result or Option.
pub trait IntoList {}

pub trait ToIterator {
    type Item;
    type IntoIter: Iterator<Item = Self::Item>;
    fn to_iter(self) -> Self::IntoIter;
}

// impl<T> ToIterator for Vec<T> {
//     type Item = T;
//     type IntoIter = std::vec::IntoIter<T>;
//     fn to_iter(self) -> Self::IntoIter {
//         self.into_iter()
//     }
// }

pub trait DeriveIterator<A>: Sized {
    fn derive_iter<T: IntoIterator<Item = A>>(iter: T) -> Self;
}

pub trait TryDeriveIterator<A>: Sized {
    fn try_from_iter<I>(l: I) -> Result<Self, Error>
    where
        I: IntoIterator<Item = A>;
}

pub trait MyCollect: Iterator {
    fn my_collect<B>(self) -> B
    where
        B: DeriveIterator<Self::Item>,
        Self: Sized,
    {
        B::derive_iter(self)
    }
}

impl<I: Iterator> MyCollect for I {}

// blanket impl
impl<T, U> Fit<U> for T
where
    U: Derive<T>,
{
    fn fit(self) -> U {
        U::derive(self)
    }
}

impl<T, U> TryFit<U> for T
where
    U: TryDerive<T>,
{
    type Error = U::Error;
    fn try_fit(self) -> Result<U, U::Error> {
        U::try_derive(self)
    }
}

impl<T> Derive<T> for T {
    fn derive(value: T) -> T {
        value
    }
}

impl<T> TryDerive<T> for T {
    type Error = std::convert::Infallible;
    fn try_derive(value: T) -> Result<Self, Self::Error> {
        Ok(value)
    }
}

pub trait Fresh {
    fn fresh() -> Self;
}
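To illustrate how these traits compose, here is a minimal, self-contained sketch. The `Word` wrapper below is a simplified stand-in for the real kcats `Word` type (which interns its string); the point it demonstrates is that the blanket impl means any `Derive` implementation automatically provides `fit`, and because we own `Derive`, the orphan rule never blocks us from implementing conversions involving std types.

```rust
/// Mirrors std::convert::From, but we own it, so the orphan rule
/// lets us implement it for any pair of types.
pub trait Derive<T>: Sized {
    fn derive(value: T) -> Self;
}

/// Mirrors std::convert::Into.
pub trait Fit<T>: Sized {
    fn fit(self) -> T;
}

// Blanket impl: anything that can be Derived can also be Fit into,
// exactly like the From/Into relationship in std.
impl<T, U> Fit<U> for T
where
    U: Derive<T>,
{
    fn fit(self) -> U {
        U::derive(self)
    }
}

// A hypothetical, simplified Word type for illustration only.
struct Word(String);

impl Derive<&str> for Word {
    fn derive(s: &str) -> Self {
        Word(s.to_string())
    }
}

fn main() {
    // Fit comes for free from the blanket impl.
    let w: Word = "swap".fit();
    assert_eq!(w.0, "swap");
}
```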
1.5.3. Internal data types
1.5.3.1. Basic internal types
We'll start by defining the basic data structures that kcats uses internally to keep track of things like the stack, the program, and lists.
//! Defines kcats internal data types.
use crate::list;
use crate::traits::*;
use crate::types::container as coll;
use crate::types::container::dictionary as dict;
use crate::types::container::environment as env;
use crate::types::container::error::Error;
use crate::types::container::Mutey;
use core::default::Default;
use core::fmt;
use internment::Intern;
use lazy_static::lazy_static;
use std::collections::{HashMap, VecDeque};
use std::hash::Hash;
use std::marker::Sync;
use std::pin::Pin;

pub mod container;
pub mod number;

/// A Word causes a kcats program to do something, usually taking some
/// items from the top of the stack, and using them to create new
/// stack items. (examples: `swap`, `+`, `dip`).
#[derive(Clone, Eq, PartialOrd, Ord, Default, Hash, PartialEq)]
pub struct Word {
    pub data: Intern<String>,
    pub quoted: bool,
}

impl fmt::Debug for Word {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        if self.quoted {
            write!(f, "[{}]", self.data.as_str())
        } else {
            write!(f, "{}", self.data.as_str())
        }
    }
}

impl Derive<String> for Word {
    fn derive(s: String) -> Self {
        Word::derive(s.as_str())
    }
}

impl Derive<&str> for Word {
    fn derive(s: &str) -> Self {
        Word {
            data: Intern::<String>::from(s),
            quoted: false,
        }
    }
}

impl<'a> Derive<&'a Word> for &'a str {
    fn derive(s: &'a Word) -> Self {
        s.data.as_str()
    }
}

impl Derive<Word> for String {
    fn derive(s: Word) -> Self {
        s.data.to_string()
    }
}

/// Represents a stack (the part of an
/// [crate::types::container::environment::Environment] that holds the data
/// values being manipulated by the program).
pub type Stack = container::List;

/// A byte array type
pub type Bytes = Vec<u8>;
/// A character type
pub type Char = char;

// Some static values for commonly used words
lazy_static! {
    pub static ref S_ASSOC: Word = "association".fit();
    pub static ref S_BOOLEAN: Word = "boolean".fit();
    pub static ref S_BYTES: Word = "bytes".fit();
    pub static ref S_CHAR: Word = "character".fit();
    pub static ref S_DICTIONARY: Word = "dictionary".fit();
    pub static ref S_DISPENSER: Word = "dispenser".fit();
    pub static ref S_ENVIRONMENT: Word = "environment".fit();
    pub static ref S_ERROR: Word = "error".fit();
    pub static ref S_FLOAT: Word = "float".fit();
    pub static ref S_INTEGER: Word = "integer".fit();
    pub static ref S_ITEM: Word = "item".fit();
    pub static ref S_LIST: Word = "list".fit();
    pub static ref S_NUMBER: Word = "number".fit();
    pub static ref S_ORDERED: Word = "ordered".fit();
    pub static ref S_PIPE: Word = "pipe".fit();
    pub static ref S_PROGRAM: Word = "program".fit();
    pub static ref S_RECEPTACLE: Word = "receptacle".fit();
    pub static ref S_SIZED: Word = "sized".fit();
    pub static ref S_STRING: Word = "string".fit();
    pub static ref S_WORD: Word = "word".fit();
}

/// A kcats data value.
#[derive(Clone)]
pub enum Item {
    /// A number value
    Number(number::Number),
    /// A word value. Words are atomic, they can't be broken down into
    /// characters like Strings.
    Word(Word),
    /// A character value, like 'a', or '\n'.
    Char(Char),
    /// A container value (that [Item]s can be taken from)
    Dispenser(coll::Dispenser),
    /// A container value (that [Item]s can be put into)
    Receptacle(coll::Receptacle),
}

impl fmt::Debug for Item {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Item::Number(n) => write!(f, "{:?}", n),
            Item::Word(w) => write!(f, "{:?}", w),
            Item::Char(c) => write!(f, "Char{:?}", c),
            Item::Dispenser(d) => write!(f, "{:?}", d),
            Item::Receptacle(r) => write!(f, "{:?}", r),
        }
    }
}

impl Item {
    /// Returns whether the item is empty - only containers can be empty.
    pub fn is_empty(&self) -> bool {
        match self {
            Item::Dispenser(coll::Dispenser::Sized(s)) => s.is_empty(),
            Item::Receptacle(coll::Receptacle::Sized(s)) => s.is_empty(),
            _ => false,
        }
    }
}

/// A Future value, used for async execution, which is how
/// multithreading is implemented in kcats.
pub type Future<T> = Pin<Box<dyn std::future::Future<Output = T> + Send>>;

/// A type for a function that advances the execution of a kcats
/// [env::Environment] by one step.
pub type StepFn = dyn Fn(env::Environment) -> Future<env::Environment> + Sync + Send;

impl PartialEq for Item {
    fn eq(&self, other: &Self) -> bool {
        match (self, other) {
            // same types, just use their own eq
            (Item::Number(a), Item::Number(b)) => a == b,
            (Item::Word(a), Item::Word(b)) => a.data == b.data,
            (
                Item::Dispenser(coll::Dispenser::Sized(a)),
                Item::Receptacle(coll::Receptacle::Sized(b)),
            ) => a == b,
            (
                Item::Receptacle(coll::Receptacle::Sized(a)),
                Item::Dispenser(coll::Dispenser::Sized(b)),
            ) => a == b,
            (Item::Dispenser(a), Item::Dispenser(b)) => a == b,
            (Item::Receptacle(a), Item::Receptacle(b)) => a == b,
            (Item::Char(a), Item::Char(b)) => a == b,
            _ => false,
        }
    }
}

/// The default Item is empty list.
impl Default for Item {
    fn default() -> Self {
        coll::Dispenser::default().fit()
    }
}

impl TryDerive<Item> for String {
    type Error = Error;
    fn try_derive(i: Item) -> Result<Self, Self::Error> {
        let s = coll::Sized::try_derive(i)?;
        match s {
            coll::Sized::String(i) => Ok(i),
            i => Err(Error::expected("string", i)),
        }
    }
}

/// Converts Item to Word but also considers a quoted word as a word,
/// eg \[foo\] -> foo.
impl TryDerive<Item> for Word {
    type Error = Error;
    fn try_derive(i: Item) -> Result<Self, Self::Error> {
        match i {
            Item::Word(i) => Ok(i),
            i => {
                let s = coll::Sized::try_derive(i)?;
                match s {
                    coll::Sized::String(s) => Ok(s.fit()),
                    s => {
                        let i2 = s.clone();
                        let l = coll::List::try_derive(s);
                        match l {
                            Ok(mut l) => {
                                if l.len() == 1 {
                                    let lm = l.mutate();
                                    let i = lm.pop_front().unwrap();
                                    i.try_fit()
                                } else {
                                    Err(Error::expected("word", l))
                                }
                            }
                            Err(_) => Err(Error::expected("word", i2)),
                        }
                    }
                }
            }
        }
    }
}

impl TryDerive<Item> for Bytes {
    type Error = Error;
    fn try_derive(i: Item) -> Result<Self, Self::Error> {
        let s = coll::Sized::try_derive(i)?;
        match s {
            coll::Sized::Bytes(b) => Ok(b),
            b => Err(Error::expected("bytes", b)),
        }
    }
}

impl TryDerive<Item> for char {
    type Error = Error;
    fn try_derive(i: Item) -> Result<Self, Self::Error> {
        match i {
            Item::Char(c) => Ok(c),
            b => Err(Error::expected("char", b)),
        }
    }
}

/// As there are no real booleans, we use the word 'yes' but literally
/// any value except empty containers is truthy. If we read a value
/// 'false', that's not actually a boolean, it's just the [Word]
/// false. The fact that the word 'yes' is used in the language but
/// 'no' is not, is a known tradeoff.
impl Derive<bool> for Item {
    fn derive(b: bool) -> Item {
        if b {
            "yes".fit()
        } else {
            Item::default()
        }
    }
}

impl From<std::io::Error> for Error {
    fn from(err: std::io::Error) -> Error {
        Error::create(list!("io"), &err.to_string(), Option::<Item>::None)
    }
}

impl Derive<&str> for Item {
    fn derive(i: &str) -> Self {
        Item::Word(Word::derive(i))
    }
}

impl Derive<String> for Item {
    fn derive(i: String) -> Self {
        Item::Dispenser(coll::Dispenser::Sized(coll::Sized::String(i)))
    }
}

impl Derive<Bytes> for Item {
    fn derive(b: Bytes) -> Self {
        Item::Dispenser(coll::Dispenser::Sized(coll::Sized::Bytes(b)))
    }
}

impl Derive<Word> for Item {
    fn derive(w: Word) -> Self {
        Item::Word(w)
    }
}

impl Derive<Char> for Item {
    fn derive(c: Char) -> Self {
        Item::Char(c)
    }
}

impl Derive<()> for Item {
    fn derive(_: ()) -> Self {
        Item::default()
    }
}

impl<T> Derive<Option<T>> for Item
where
    Item: Derive<T>,
{
    fn derive(opt: Option<T>) -> Item {
        match opt {
            Some(t) => Item::derive(t),
            None => Item::default(),
        }
    }
}

/// A generic impl to convert an Item to a vec of the given
/// type. Assumes the Item is some sort of container and converts each
/// item in the container.
impl<T: TryDerive<Item, Error = Error>> TryDerive<Item> for Vec<T> {
    type Error = Error;
    fn try_derive(i: Item) -> Result<Self, Self::Error> {
        // First try to convert the Item to an IntoIterator<Item>
        let it: Box<dyn Iterator<Item = Item>> = i.try_fit()?;
        it.map(T::try_derive).collect()
    }
}

/// A macro to build a kcats List, accepts any values that are
/// convertible to [Item].
#[macro_export]
macro_rules! list {
    ( $( $x:expr ),* $(,)? ) => {
        {
            use $crate::traits::*;
            use $crate::types::Item;
            let v: Vec<Item> = vec![ $( $x.fit(), )* ];
            $crate::types::container::List::derive(v)
        }
    };
}

mod serde {
    //! Support for json serialization of kcats objects
    use super::Item;
    use crate::traits::*;
    use crate::types::container as coll;
    use crate::types::container::associative as assoc;
    use crate::types::number;
    use crate::types::Error;
    use serde::de::{self, Deserialize, Deserializer, Visitor};
    use serde::ser::{Serialize, Serializer};
    use std::collections::HashMap;
    use std::fmt;

    struct ItemVisitor;

    impl<'de> Visitor<'de> for ItemVisitor {
        type Value = Item;

        fn expecting(&self, formatter: &mut fmt::Formatter) -> fmt::Result {
            formatter.write_str("expected a specific representation for Item")
        }

        fn visit_i64<E>(self, value: i64) -> Result<Self::Value, E>
        where
            E: de::Error,
        {
            Ok(Item::Number(number::Number::Int(value)))
        }

        fn visit_u64<E>(self, value: u64) -> Result<Self::Value, E>
        where
            E: de::Error,
        {
            Ok(Item::Number(number::Number::Int(value as i64)))
        }

        fn visit_f64<E>(self, value: f64) -> Result<Self::Value, E>
        where
            E: de::Error,
        {
            Ok(Item::Number(number::Number::Float(value)))
        }

        fn visit_none<E>(self) -> Result<Self::Value, E>
        where
            E: de::Error,
        {
            Ok(Item::default())
        }

        fn visit_bool<E>(self, v: bool) -> Result<Self::Value, E>
        where
            E: de::Error,
        {
            Ok(Item::derive(v))
        }

        fn visit_str<E>(self, v: &str) -> Result<Self::Value, E>
        where
            E: de::Error,
        {
            Ok(Item::Dispenser(coll::Dispenser::Sized(
                coll::Sized::String(v.to_string()),
            )))
        }

        fn visit_byte_buf<E>(self, v: Vec<u8>) -> Result<Self::Value, E>
        where
            E: de::Error,
        {
            Ok(Item::Dispenser(coll::Dispenser::Sized(coll::Sized::Bytes(
                v,
            ))))
        }

        fn visit_map<A>(self, mut ma: A) -> Result<Self::Value, A::Error>
        where
            A: de::MapAccess<'de>,
        {
            let mut map = HashMap::new();
            while let Some((key, value)) = ma.next_entry::<assoc::KeyItem, Item>()? {
                map.insert(key, value);
            }
            Ok(Item::Dispenser(coll::Dispenser::Sized(
                coll::Sized::Associative(assoc::Associative::Assoc(map.fit())),
            )))
        }

        fn visit_seq<A>(self, mut seq: A) -> Result<Self::Value, A::Error>
        where
            A: de::SeqAccess<'de>,
        {
            let mut items = Vec::new();
            while let Some(item) = seq.next_element::<Item>()? {
                items.push(item);
            }
            Ok(coll::List::derive(items).fit())
        }
    }

    impl<'de> Deserialize<'de> for Item {
        fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>
        where
            D: Deserializer<'de>,
        {
            deserializer.deserialize_any(ItemVisitor)
        }
    }

    impl From<serde_json::Error> for Error {
        fn from(err: serde_json::Error) -> Error {
            Error::create(list!("serialize"), &err.to_string(), Option::<Item>::None)
        }
    }

    impl Serialize for Item {
        fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
        where
            S: Serializer,
        {
            match self {
                Item::Number(num) => num.serialize(serializer),
                Item::Char(c) => serializer.serialize_char(*c),
                Item::Word(w) => serializer.serialize_str(w.fit()),
                // Handle other variants
                Item::Dispenser(ref dispenser) => dispenser.serialize(serializer),
                Item::Receptacle(ref receptacle) => receptacle.serialize(serializer),
            }
        }
    }
}
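The truthiness convention described above (the word `yes` stands in for true, and the empty list, the default Item, stands in for false) can be sketched in isolation. The `Item` enum below is a deliberately stripped-down, hypothetical stand-in for the real one, kept just large enough to show the `bool` conversion:

```rust
// A minimal, hypothetical stand-in for the kcats Item type,
// illustrating the Derive<bool> convention from the module above.
#[derive(Debug, PartialEq)]
enum Item {
    Word(String),
    List(Vec<Item>),
}

impl Default for Item {
    // The default Item is the empty list, which is falsy.
    fn default() -> Self {
        Item::List(Vec::new())
    }
}

impl From<bool> for Item {
    fn from(b: bool) -> Item {
        if b {
            // true becomes the word 'yes'
            Item::Word("yes".to_string())
        } else {
            // false becomes the empty list
            Item::default()
        }
    }
}

fn main() {
    assert_eq!(Item::from(true), Item::Word("yes".to_string()));
    assert_eq!(Item::from(false), Item::List(vec![]));
}
```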
1.5.3.2. Number types
//! Support for numbers in kcats. Currently just [i64] and [f64], but
//! this module will eventually support bignums and autopromotion.
use super::container::error::Error;
use crate::traits::*;
use crate::types::container as cont;
use crate::types::Item;
use num_integer::Roots;
use serde::ser::{Serialize, Serializer};
use std::num::{ParseFloatError, ParseIntError};

/// An integer type
pub type Int = i64;
/// A floating point type
pub type Float = f64;

#[derive(Clone, Debug)]
pub enum Number {
    Int(Int),
    Float(Float),
}

impl Number {
    pub fn add(&self, other: Number) -> Number {
        match (self, other) {
            (Number::Int(i), Number::Int(j)) => Number::Int(i + j),
            (Number::Float(i), Number::Float(j)) => Number::Float(i + j),
            (Number::Int(i), Number::Float(j)) => Number::Float(*i as Float + j),
            (Number::Float(i), Number::Int(j)) => Number::Float(i + j as Float),
        }
    }

    pub fn subtract(&self, other: Number) -> Number {
        match (self, other) {
            (Number::Int(i), Number::Int(j)) => Number::Int(i - j),
            (Number::Float(i), Number::Float(j)) => Number::Float(i - j),
            (Number::Int(i), Number::Float(j)) => Number::Float(*i as Float - j),
            (Number::Float(i), Number::Int(j)) => Number::Float(i - j as Float),
        }
    }

    pub fn multiply(&self, other: Number) -> Number {
        match (self, other) {
            (Number::Int(i), Number::Int(j)) => Number::Int(i * j),
            (Number::Float(i), Number::Float(j)) => Number::Float(i * j),
            (Number::Int(i), Number::Float(j)) => Number::Float(*i as Float * j),
            (Number::Float(i), Number::Int(j)) => Number::Float(i * j as Float),
        }
    }

    pub fn divide(i: Float, j: Float) -> Result<Float, Error> {
        let q = i / j;
        if q.is_nan() {
            Err(Error::division_by_zero())
        } else {
            Ok(q)
        }
    }

    pub fn gt(i: Number, j: Number) -> bool {
        match (i, j) {
            (Number::Int(i), Number::Int(j)) => i > j,
            (Number::Float(i), Number::Float(j)) => i > j,
            (Number::Int(i), Number::Float(j)) => i as Float > j,
            (Number::Float(i), Number::Int(j)) => i > j as Float,
        }
    }

    pub fn lt(i: Number, j: Number) -> bool {
        match (i, j) {
            (Number::Int(i), Number::Int(j)) => i < j,
            (Number::Float(i), Number::Float(j)) => i < j,
            (Number::Int(i), Number::Float(j)) => (i as Float) < j,
            (Number::Float(i), Number::Int(j)) => i < j as Float,
        }
    }

    pub fn gte(i: Number, j: Number) -> bool {
        match (i, j) {
            (Number::Int(i), Number::Int(j)) => i >= j,
            (Number::Float(i), Number::Float(j)) => i >= j,
            (Number::Int(i), Number::Float(j)) => (i as Float) >= j,
            (Number::Float(i), Number::Int(j)) => i >= j as Float,
        }
    }

    pub fn lte(i: Number, j: Number) -> bool {
        match (i, j) {
            (Number::Int(i), Number::Int(j)) => i <= j,
            (Number::Float(i), Number::Float(j)) => i <= j,
            (Number::Int(i), Number::Float(j)) => (i as Float) <= j,
            (Number::Float(i), Number::Int(j)) => i <= j as Float,
        }
    }

    pub fn abs(&self) -> Number {
        match self {
            Number::Int(i) => Number::Int(i.abs()),
            Number::Float(f) => Number::Float(f.abs()),
        }
    }

    pub fn sqrt(&self) -> Number {
        match self {
            Number::Int(i) => Number::Int(i.sqrt()),
            Number::Float(f) => Number::Float(f.sqrt()),
        }
    }
}

impl PartialEq for Number {
    fn eq(&self, other: &Self) -> bool {
        match (self, other) {
            (Number::Int(a), Number::Int(b)) => a == b,
            (Number::Float(a), Number::Float(b)) => a == b,
            (Number::Float(a), Number::Int(b)) => *a == *b as Float,
            (Number::Int(a), Number::Float(b)) => *a as Float == *b,
        }
    }
}

impl TryDerive<Number> for Float {
    type Error = Error;
    fn try_derive(i: Number) -> Result<Self, Self::Error> {
        match i {
            Number::Float(i) => Ok(i),
            i => Err(Error::expected("float", i)),
        }
    }
}

impl TryDerive<Number> for Int {
    type Error = Error;
    fn try_derive(i: Number) -> Result<Self, Self::Error> {
        match i {
            Number::Int(i) => Ok(i),
            i => Err(Error::expected("integer", i)),
        }
    }
}

impl Derive<Int> for Item {
    fn derive(c: Int) -> Self {
        Item::Number(Number::Int(c))
    }
}

impl Derive<Float> for Item {
    fn derive(c: Float) -> Self {
        Item::Number(Number::Float(c))
    }
}

impl From<ParseIntError> for Error {
    fn from(e: ParseIntError) -> Self {
        Error::parse(e.to_string().as_str())
    }
}

impl From<ParseFloatError> for Error {
    fn from(e: ParseFloatError) -> Self {
        Error::parse(e.to_string().as_str())
    }
}

impl TryDerive<Item> for Number {
    type Error = Error;
    fn try_derive(i: Item) -> Result<Self, Self::Error> {
        let fromstr = |s: String| {
            let r = s
                .as_str()
                .parse::<i64>()
                .map(Number::Int)
                .map_err(Error::from);
            r.or_else(|_| s.as_str().parse::<Float>().map(Number::Float))
                .map_err(Error::from)
        };
        match i {
            Item::Number(i) => Ok(i),
            Item::Char(c) => Ok(Number::Int(c as Int)),
            Item::Dispenser(cont::Dispenser::Sized(cont::Sized::String(s))) => fromstr(s),
            Item::Receptacle(cont::Receptacle::Sized(cont::Sized::String(s))) => fromstr(s),
            i => Err(Error::expected("number", i)),
        }
    }
}

impl TryDerive<Item> for Int {
    type Error = Error;
    fn try_derive(i: Item) -> Result<Self, Self::Error> {
        match Number::try_derive(i)? {
            Number::Int(i) => Ok(i),
            i => Err(Error::expected("integer", i)),
        }
    }
}

impl TryDerive<Item> for Float {
    type Error = Error;
    fn try_derive(i: Item) -> Result<Self, Self::Error> {
        match Number::try_derive(i)? {
            Number::Float(i) => Ok(i),
            i => Err(Error::expected("float", i)),
        }
    }
}

impl Derive<Number> for Item {
    fn derive(c: Number) -> Self {
        Item::Number(c)
    }
}

impl Serialize for Number {
    fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
    where
        S: Serializer,
    {
        match self {
            Number::Int(i) => serializer.serialize_i64(*i),
            Number::Float(f) => serializer.serialize_f64(*f),
        }
    }
}
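The promotion behavior of `Number::add` (mixing an integer with a float yields a float) can be exercised in isolation. This is a trimmed copy of the enum and its `add` method from the module above, with derived equality instead of the module's custom `PartialEq`:

```rust
// Trimmed from the kcats number module: mixed Int/Float arithmetic
// promotes the result to Float.
#[derive(Debug, PartialEq)]
enum Number {
    Int(i64),
    Float(f64),
}

impl Number {
    fn add(&self, other: Number) -> Number {
        match (self, other) {
            (Number::Int(i), Number::Int(j)) => Number::Int(i + j),
            (Number::Float(i), Number::Float(j)) => Number::Float(i + j),
            // Int + Float promotes to Float
            (Number::Int(i), Number::Float(j)) => Number::Float(*i as f64 + j),
            (Number::Float(i), Number::Int(j)) => Number::Float(i + j as f64),
        }
    }
}

fn main() {
    assert_eq!(Number::Int(2).add(Number::Int(3)), Number::Int(5));
    assert_eq!(Number::Int(2).add(Number::Float(0.5)), Number::Float(2.5));
    assert_eq!(Number::Float(1.5).add(Number::Int(1)), Number::Float(2.5));
}
```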
1.5.3.3. Container types
//! Support for containers in kcats. Includes types like [List],
//! [Set], [associative::Association], String, [pipe::In],
//! [pipe::Out], and Byte arrays. The container contract is you can
//! put things into, or take things out of them. [Receptacle]s are for
//! putting into, and [Dispenser]s are for taking out of. For
//! underlying types that support both operations (like [List]), we
//! can easily convert between [Receptacle] and [Dispenser] as needed.
pub mod associative;
pub mod dictionary;
pub mod environment;
pub mod error;
pub mod pipe;

use futures::FutureExt;

use self::associative as assoc;
use crate::traits::*;
use crate::types::container::error::Nested;
use crate::types::container::pipe::FutureTake;
use crate::types::number::{Int, Number};
use crate::types::*;
use core::fmt;
use std::convert::Infallible;
use std::{collections::HashSet, future, sync};
use sync::Arc;

/// A generic Arc type that we control
//pub type Arc<T> = Newtype<sync::Arc<T>>;

/// A generic List type
pub type Listy<I> = VecDeque<I>;
/// A generic Set type
pub type Setty<I> = HashSet<I>;

/// A specific List type
pub type ListContent = Listy<Item>;
pub type List = Arc<ListContent>;
pub type Set = Arc<Setty<assoc::KeyItem>>;

impl Derive<HashSet<assoc::KeyItem>> for Set {
    fn derive(h: HashSet<assoc::KeyItem>) -> Set {
        Arc::new(h)
    }
}

impl Derive<ListContent> for List {
    fn derive(l: ListContent) -> List {
        Arc::new(l)
    }
}

impl Derive<ListContent> for Item {
    fn derive(l: ListContent) -> Item {
        List::derive(l).fit()
    }
}

impl DeriveIterator<Item> for List {
    fn derive_iter<I>(iter: I) -> Self
    where
        I: IntoIterator<Item = Item>,
    {
        Arc::new(iter.into_iter().collect::<VecDeque<Item>>())
    }
}

impl DeriveIterator<Char> for List {
    fn derive_iter<I>(iter: I) -> Self
    where
        I: IntoIterator<Item = Char>,
    {
        Arc::new(
            iter.into_iter()
                .map(Item::derive)
                .collect::<VecDeque<Item>>(),
        )
    }
}

// impl DeriveIterator<assoc::KeyItem> for Set {
//     fn derive_iter<I>(iter: I) -> Self
//     where
//         I: IntoIterator<Item = assoc::KeyItem>,
//     {
//         sync::Arc::new(iter.into_iter().collect::<HashSet<assoc::KeyItem>>())
//     }
// }

impl<T> Fresh for Arc<T>
where
    T: Default,
{
    fn fresh() -> Self {
        Arc::new(T::default())
    }
}

/// A trait for joining two values together. There are some precedence rules:
///
/// 1. If there are two different types being joined, the type that is
/// returned is either the most specialized types of the two being
/// joined, or the most specialized type that's possible to construct
/// given the two values. (For example, joining a Set with a List, or
/// vice versa, will always be a Set. Joining an Association with a
/// Dictionary will be an Associative enum but the variant will depend
/// on whether the Association data fits the schema of a
/// Dictionary. If so, it will be Dictionary, otherwise Assoc.)
///
/// 2. If the result type is keyed, (eg, Map or Set or struct types),
/// the RHS argument's keys take precedence over self's.
pub trait Join<RHS> {
    type Output;
    type Error;
    fn join(self, rhs: RHS) -> Result<Self::Output, Self::Error>;
}

impl Join<&str> for String {
    type Output = String;
    type Error = Infallible;
    fn join(mut self, rhs: &str) -> Result<Self::Output, Self::Error> {
        self.push_str(rhs);
        Ok(self)
    }
}

impl Join<char> for String {
    type Output = String;
    type Error = Infallible;
    fn join(mut self, rhs: char) -> Result<Self::Output, Self::Error> {
        self.push(rhs);
        Ok(self)
    }
}

impl Join<List> for List {
    type Output = List;
    type Error = Infallible;
    fn join(mut self, rhs: List) -> Result<Self::Output, Self::Error> {
        //println!("Joining list to list");
        let am = self.mutate();
        am.extend(rhs.iter().cloned());
        Ok(self)
    }
}

impl Join<Set> for Set {
    type Output = Set;
    type Error = Infallible;
    fn join(mut self, rhs: Set) -> Result<Self::Output, Self::Error> {
        let am = self.mutate();
        am.extend(rhs.iter().cloned());
        Ok(self)
    }
}

/// When joining a List with a String, which type we get back depends
/// on the contents of the list. If the list has non-char items in it,
/// we get a List. Otherwise, a string.
impl Join<String> for List {
    type Output = Sized;
    type Error = Infallible;
    fn join(mut self, rhs: String) -> Result<Self::Output, Self::Error> {
        match self
            .iter()
            .cloned()
            .map(|i| Char::try_derive(i))
            .collect::<Result<Vec<Char>, Error>>()
        {
            Ok(vc) => {
                //join as string if all the list items are chars
                let mut x = vc.iter().collect::<String>();
                x.push_str(rhs.as_str());
                Ok(Sized::String(x))
            }
            Err(_) => {
                // join as list
                self.mutate().extend(rhs.chars().map(Item::derive));
                Ok(Sized::List(self))
            }
        }
    }
}

/// When joining a String with a List, which type we get back depends
/// on the contents of the list. If the list has non-char items in it,
/// we get a List. Otherwise, a string.
impl Join<List> for String {
    type Output = Sized;
    type Error = Infallible;
    fn join(mut self, rhs: List) -> Result<Self::Output, Self::Error> {
        match rhs
            .iter()
            .cloned()
            .map(|i| Char::try_derive(i))
            .collect::<Result<Vec<Char>, Error>>()
        {
            Ok(vc) => {
                //join as string if all the list items are chars
                let x = vc.iter().collect::<String>();
                self.push_str(x.as_str());
                Ok(Sized::String(self))
            }
            Err(_) => {
                // join as list
                let mut sl: List = self.fit();
                sl.mutate().extend(rhs.iter().cloned());
                Ok(Sized::List(sl))
            }
        }
    }
}

impl Join<List> for assoc::Associative {
    type Output = assoc::Associative;
    type Error = Error;
    fn join(self, other: List) -> Result<Self::Output, <Self as Join<List>>::Error> {
        //println!("Joining list to associative");
        let la =
            assoc::Associative::Assoc(assoc::Association::try_from_iter(other.iter().cloned())?);
        Ok(self.join(la).unwrap())
    }
}

impl Join<assoc::Associative> for List {
    type Output = assoc::Associative;
    type Error = Error;
    fn join(self, other: assoc::Associative) -> Result<Self::Output, Self::Error> {
        //println!("Joining associative to list");
        let sa =
            assoc::Associative::Assoc(assoc::Association::try_from_iter(self.iter().cloned())?);
        Ok(sa.join(other).unwrap())
    }
}

/// Joining a List with a Set will be a set.
impl Join<Set> for List {
    type Output = Set;
    type Error = Error;
    fn join(self, mut other: Set) -> Result<Self::Output, Self::Error> {
        let bm = other.mutate();
        bm.extend(
            self.iter()
                .cloned()
                .map(assoc::KeyItem::try_derive)
                .collect::<Result<Vec<assoc::KeyItem>, Error>>()?,
        );
        Ok(other)
    }
}

impl Join<String> for String {
    type Output = String;
    type Error = Infallible;
    fn join(mut self, other: String) -> Result<Self::Output, Self::Error> {
        self.push_str(&other);
        Ok(self)
    }
}

/// Joins two containers into one.
impl Join<Sized> for Sized {
    type Output = Sized;
    type Error = Error;
    fn join(self, other: Sized) -> Result<Self::Output, Self::Error> {
        //println!("Joining sized {:?} to sized {:?}", self, other);
        Ok(match (self, other) {
            (Sized::Associative(a), Sized::List(l)) => Sized::Associative(a.join(l)?),
            (Sized::List(l), Sized::Associative(a)) => Sized::Associative(l.join(a)?),
            (Sized::Associative(a), Sized::Associative(b)) => {
                Sized::Associative(a.join(b).unwrap())
            }
            (Sized::List(a), Sized::List(b)) => Sized::List(a.join(b).unwrap()),
            (Sized::Set(a), Sized::Set(b)) => Sized::Set(a.join(b).unwrap()),
            (Sized::List(a), Sized::Set(b)) => Sized::Set(a.join(b)?),
            (Sized::Set(mut a), Sized::List(b)) => {
                let am = a.mutate();
                am.extend(
                    b.iter()
                        .cloned()
                        .map(assoc::KeyItem::try_derive)
                        .collect::<Result<Vec<assoc::KeyItem>, Error>>()?,
                );
                Sized::Set(a)
            }
            (Sized::String(mut a), Sized::String(b)) => {
                a.push_str(&b);
                Sized::String(a)
            }
            (Sized::Bytes(mut a), Sized::Bytes(b)) => {
                a.extend(b);
                Sized::Bytes(a)
            }
            (Sized::String(s), Sized::List(l)) => s.join(l).unwrap(),
            (Sized::List(l), Sized::String(s)) => l.join(s).unwrap(),
            (s, other) => {
                if s.is_empty() {
                    other
                } else if other.is_empty() {
                    s
                } else {
                    Err(Error::expected("joinable", list!(s, other)))?
                }
            }
        })
    }
}

pub trait Container<T> {
    fn has(&self, item: &T) -> bool;
}

pub trait Count {
    fn count(&self) -> usize;
}

/// A trait for containers where you can take an item out "in-place"
/// without blocking. The container itself is mutated and the item is
/// returned.
pub trait SimpleTake {
    type Item: Send + Fit<Item>;
    fn take_simple(&mut self) -> Option<Self::Item>;
}

// pub trait DemotingTake {
//     type Item;
//     type Output;
//     fn take_demoting(self) -> (Option<Self::Item>, Self::Output);
// }

pub trait Take {
    type Output;
    type Item;
    fn take(self) -> Future<(Result<Option<Self::Item>, Error>, Self::Output)>;
}

/// A blanket impl for Take, for any type that already implements SimpleTake.
impl<T> Take for T
where
    T: SimpleTake + Send + 'static,
{
    type Output = T;
    type Item = Item;
    fn take(mut self) -> Future<(Result<Option<Self::Item>, Error>, Self::Output)> {
        let item = self.take_simple();
        let result = future::ready((Ok(item.map(|i| i.fit())), self));
        Box::pin(result)
    }
}

impl Count for Sized {
    fn count(&self) -> usize {
        match self {
            Self::Associative(a) => a.len(),
            Self::List(l) => l.count(),
            Self::String(s) => s.len(),
            Self::Bytes(b) => b.len(),
            Self::Set(s) => s.len(),
        }
    }
}

impl Container<Item> for Sized {
    fn has(&self, other: &Item) -> bool {
        //println!("Has: {:?}\n{:?}", self, other);
        match (self, other) {
            (Sized::Associative(a), other) => {
                assoc::KeyItem::try_derive(other.clone()).map_or(false, |k| a.contains_key(&k))
            }
            (Sized::List(l), other) => l.contains(other),
            (Sized::Set(s), Item::Dispenser(Dispenser::Sized(Sized::Set(other)))) => {
                other.is_subset(s)
            }
            (Sized::Set(s), Item::Receptacle(Receptacle::Sized(Sized::Set(other)))) => {
                other.is_subset(s)
            }
            (Sized::Set(s), other) => {
                assoc::KeyItem::try_derive(other.clone()).map_or(false, |k| s.contains(&k))
            }
            (Sized::String(container), other) => match other {
                Item::Char(c) => container.has(c),
                i => match String::try_derive(i.clone()) {
                    Ok(ref s) => container.has(s),
                    Err(_) => false,
                },
            },
            _ => false,
        }
    }
}
impl SimpleTake for Sized { //type Output = Self; type Item = Item; fn take_simple(&mut self) -> Option<Self::Item> { //println!("Taking! {:?}", self); match self { Self::Associative(ref mut a) => { let v = a.take_simple(); *self = Sized::Associative(a.clone()); v } Sized::List(ref mut l) => l.take_simple(), Sized::String(ref mut s) => s.take_simple().map(Item::derive), Sized::Bytes(ref mut b) => b.take_simple().map(Item::derive), Sized::Set(ref mut s) => s.take_simple(), } } } impl Count for String { fn count(&self) -> usize { self.len() } } impl Container<char> for String { fn has(&self, item: &char) -> bool { self.contains(*item) } } impl Container<String> for String { fn has(&self, item: &String) -> bool { self.contains(item.as_str()) } } pub trait Ordered { /// Appends the items to the beginning of this list, preserving /// their order. eg `[1, 2, 3].append([4, 5, 6])` -> `[4, 5, 6, 1, /// 2, 3]`. fn prepend(&mut self, items: List); /// Appends the items in the iterator to the beginning of this /// list, preserving order. fn prepend_iter<T: IntoIterator<Item = Item>>(&mut self, items: T); /// Reverses the order of the list. 
fn reverse(&mut self); } pub trait Mutey<T> { fn mutate(&mut self) -> &mut T; } impl<T: Clone> Mutey<T> for Arc<T> { fn mutate(&mut self) -> &mut T { Arc::make_mut(self) } } impl Count for List { fn count(&self) -> usize { self.len() } } impl Container<Item> for List { fn has(&self, i: &Item) -> bool { self.contains(i) } } // impl Take for List { // type Item = Item; // type Output = List; // fn take(mut self) -> Future<(Self::Output, Result<Option<Item>, Error>)> { // let v = self.mutate().pop_front(); // Box::pin(future::ready((self, Ok(v)))) // } // } //impl Take for impl Ordered for List { fn prepend(&mut self, items: List) { self.prepend_iter(items.iter().cloned()); } fn prepend_iter<T: IntoIterator<Item = Item>>(&mut self, items: T) { let m = self.mutate(); let ct = m.len(); m.extend(items); m.rotate_left(ct); } fn reverse(&mut self) { let m = self.mutate(); m.make_contiguous().reverse(); } } /// A generic container type, all we know is it can contain multiple /// items. Includes things like lists, sets, and IO channels. Items /// can be taken out. #[derive(Clone, PartialEq)] pub enum Dispenser { /// A container with a known number of items inside Sized(Sized), /// A pipe that dispenses an unknown number of items Out(pipe::Out), /// Similar to Out but also convertible to [Receptacle] Tunnel(pipe::Tunnel), } impl fmt::Debug for Dispenser { fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result { match self { Dispenser::Sized(sized) => write!(f, "{:?}", sized), Dispenser::Out(out) => write!(f, "{:?}", out), Dispenser::Tunnel(tunnel) => write!(f, "{:?}", tunnel), } } } /// A generic container type, all we know is it can contain multiple /// items. Includes things like lists, sets, and IO channels. Items /// can be put in. 
#[derive(Clone, PartialEq)] pub enum Receptacle { /// A container with a known number of items inside Sized(Sized), /// A pipe that can receive an arbitrary number of items In(pipe::In), /// Similar to In but also convertible to [Dispenser] Tunnel(pipe::Tunnel), } impl fmt::Debug for Receptacle { fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result { match self { Receptacle::Sized(sized) => write!(f, "{:?}", sized), Receptacle::In(i) => write!(f, "{:?}", i), Receptacle::Tunnel(tunnel) => write!(f, "{:?}", tunnel), } } } /// Collections that have a definite size that we can access. Implies /// that it can also be appended to. #[derive(Clone)] pub enum Sized { /// Associative containers associate Items in pairs, like Map or /// Dict in other languages. Associative(assoc::Associative), /// List containers have multiple Items in a specific order. List(List), /// Set containers have multiple Items in no particular order, and /// each Item can only appear once. Set(Set), //TODO: these should be inside an Arc too /// A String is a chunk of text, like a list of individual /// characters. String(String), /// Bytes is the lowest common denominator form of data, useful /// for when no other type applies. Bytes(Bytes), } impl fmt::Debug for Sized { fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result { match self { Sized::Associative(a) => write!(f, "{:?}", a), Sized::List(l) => write!(f, "{:?}", l), Sized::Set(s) => write!(f, "{:?}", s), Sized::String(s) => write!(f, "{:?}", s), Sized::Bytes(b) => write!(f, "{:?}", b), } } } /// Empty Sized containers are equal to each other. 
impl PartialEq for Sized { fn eq(&self, other: &Self) -> bool { match (self, other) { (Sized::Associative(a), Sized::Associative(b)) => a == b, (Sized::List(a), Sized::List(b)) => a == b, (Sized::String(a), Sized::String(b)) => a == b, (Sized::Bytes(a), Sized::Bytes(b)) => a == b, (Sized::Set(a), Sized::Set(b)) => a == b, _ => self.is_empty() && other.is_empty(), } } } /// Takes an item out of the [Dispenser], and returns a future /// that gives a new [Dispenser], and the [Item] that was removed /// (if there was one). impl Take for Dispenser { type Output = Self; type Item = Item; fn take(self) -> Future<(Result<Option<Self::Item>, Error>, Self::Output)> { match self { Dispenser::Sized(mut s) => { let v = s.take_simple(); //let (r, s) = s.take(); // i.map(|r| { // (Dispenser::SIzed(s), Self::result_to_option(r)) // }) Box::pin(future::ready((Ok(v), Dispenser::Sized(s)))) } Dispenser::Out(mut o) => { Box::pin(async move { (o.take_future().await, Dispenser::Out(o)) }) } Dispenser::Tunnel(mut t) => { Box::pin(async move { (t.take_future().await, Dispenser::Tunnel(t)) }) } } } } pub fn result_to_option(r: Result<Option<Item>, Error>) -> Option<Item> { match r { Ok(Some(i)) => Some(i), Ok(None) => None, Err(e) => Some(Item::derive(e)), } } impl Dispenser { // /// Takes an item out of the [Dispenser], and returns a future // /// that gives a new [Dispenser], and the [Item] that was removed // /// (if there was one). 
// pub fn take(self) -> Future<(Dispenser, Option<Item>)> { // match self { // Dispenser::Sized(mut s) => { // let v = s.take_simple(); // //let (r, s) = s.take(); // // i.map(|r| { // // (Dispenser::SIzed(s), Self::result_to_option(r)) // // }) // Box::pin(future::ready((Dispenser::Sized(s), v))) // } // Dispenser::Out(mut o) => Box::pin({ // let i = o.take(); // i.map(|r| (Dispenser::Out(o), Self::result_to_option(r))) // }), // Dispenser::Tunnel(mut t) => Box::pin({ // let i = t.take(); // i.map(|r| { // ( // Dispenser::Tunnel(t), // match r { // Ok(Some(i)) => Some(i), // Ok(None) => None, // Err(e) => Some(Item::derive(e)), // }, // ) // }) // }), // } // } } impl SimpleTake for List { type Item = Item; fn take_simple(&mut self) -> Option<Self::Item> { if self.is_empty() { None } else { let lm = self.mutate(); let i = lm.pop_front(); i } } } impl SimpleTake for String { type Item = char; fn take_simple(&mut self) -> Option<Self::Item> { // TODO: this may perform badly let first_char = self.chars().next(); self.drain(..first_char.map(|s| s.len_utf8()).unwrap_or(0)); first_char } } impl SimpleTake for Bytes { type Item = Int; fn take_simple(&mut self) -> Option<Self::Item> { if self.is_empty() { None } else { let i = Some(self[0] as Int); self.drain(..1); i } } } impl SimpleTake for Set { type Item = Item; fn take_simple(&mut self) -> Option<Self::Item> { let sm = self.mutate(); sm.iter().next().map(|v| v.clone().fit()) } } impl Sized { /// Returns whether the container is empty pub fn is_empty(&self) -> bool { self.count() == 0 } /// Takes an item from the back (end) of the container. 
pub fn pop(self) -> (Self, Option<Item>) { match self { Sized::Associative(mut a) => { let v = a.take_simple(); (Sized::Associative(a), v) } Sized::List(mut l) => { let lm = l.mutate(); let i = lm.pop_back(); (Sized::List(l), i) } Sized::String(mut s) => s .pop() .map(|c| (Sized::String(s), Some(c.fit()))) .unwrap_or((Sized::String(String::new()), None)), Sized::Bytes(mut b) => b .pop() .map(|c| (Sized::Bytes(b), Some((c as Int).fit()))) .unwrap_or((Sized::Bytes(vec![]), None)), Sized::Set(mut s) => { let i = s.iter().next().cloned(); let sm = s.mutate(); if let Some(i) = i.clone() { sm.take(&i); } (Sized::Set(s), i.map(Item::derive)) } } } /// Puts an item into the container, at the end. pub fn put(self, other: Item) -> Result<Sized, Error> { match (self, other) { (Sized::List(mut c), i) => { c.mutate().push_back(i); Ok(Sized::List(c)) } (Sized::Associative(a), l) => Ok(Sized::Associative(a.put(l)?)), (Sized::Set(mut s), i) => { s.mutate().insert(assoc::KeyItem::try_derive(i)?); Ok(Sized::Set(s)) } (Sized::Bytes(mut b), Item::Number(Number::Int(i))) => { b.push(i as u8); Ok(Sized::Bytes(b)) } (Sized::Bytes(_), i) => Err(Error::expected("integer", i)), (Sized::String(mut s), Item::Char(c)) => Ok(Sized::String({ s.push(c); s })), (Sized::String(_), i) => Err(Error::expected("char", i)), } } /// Returns a new empty version of this container. Does not /// modify this container. The new container will be the same /// type as this one (if this is a [Sized::String], you'll get an empty /// [Sized::String], etc) pub fn empty(&self) -> Sized { match self { Sized::Associative(_) => { Sized::Associative(assoc::Associative::Assoc(assoc::Association::fresh())) } Sized::List(_) => Sized::List(List::default()), Sized::Set(_) => Sized::Set(Set::default()), Sized::String(_) => Sized::String(String::new()), Sized::Bytes(_) => Sized::Bytes(vec![]), } } } impl Receptacle { /// Puts the given [Item] into this container, items are added at /// the end. 
pub fn put(self, i: Item) -> Future<Result<Receptacle, Error>> { match self { Receptacle::Sized(s) => Box::pin(future::ready(s.put(i).map(Receptacle::Sized))), Receptacle::In(mut p) => Box::pin(p.put(i).map(|r| r.map(|_| Receptacle::In(p)))), Receptacle::Tunnel(mut t) => { let p = t.put(i); Box::pin(p.map(|r| r.map(|_| Receptacle::Tunnel(t)))) } } } } impl IntoIterator for Sized { type Item = Item; type IntoIter = Box<dyn Iterator<Item = Self::Item>>; fn into_iter(self) -> Self::IntoIter { match self { Sized::Associative(map) => Box::new(map.to_iter().map(|kv| kv.fit())), Sized::List(list) => { let items: Vec<_> = list.iter().cloned().collect(); Box::new(items.into_iter()) } Sized::String(s) => { let chars: Vec<char> = s.chars().collect(); Box::new(chars.into_iter().map(|c| c.fit())) } Sized::Bytes(b) => { let vec: Vec<Item> = b .into_iter() .map(|byte| Item::derive(byte as Int)) .collect(); Box::new(vec.into_iter()) } Sized::Set(s) => { let items: Vec<_> = s.iter().cloned().map(|i| i.fit()).collect(); Box::new(items.into_iter()) } } } } impl TryDerive<Dispenser> for Sized { type Error = Error; fn try_derive(c: Dispenser) -> Result<Self, Self::Error> { //println!("from iterable {:?}", c); match c { Dispenser::Sized(s) => Ok(s), i => Err(Error::expected("sized", i)), } } } impl TryDerive<Receptacle> for Sized { type Error = Error; fn try_derive(c: Receptacle) -> Result<Self, Self::Error> { match c { Receptacle::Sized(s) => Ok(s), i => Err(Error::expected("sized", Item::Receptacle(i))), } } } impl TryDerive<Sized> for List { type Error = Error; fn try_derive(s: Sized) -> Result<Self, Self::Error> { match s { Sized::List(l) => Ok(l), Sized::Associative(a) => Ok(List::derive_iter(a.to_iter().map(Item::derive))), i => Err(Error::expected("list", i)), } } } // Implement Derive for Item where T is an Iterator impl<T, I> Derive<T> for Item where T: ToIterator<Item = I> + IntoList, I: Fit<Item>, { fn derive(iter: T) -> Self { let l: List = 
iter.to_iter().map(Fit::fit).my_collect(); Item::derive(l) } } impl TryDerive<List> for Vec<dict::Namespace> { type Error = Error; fn try_derive(l: List) -> Result<Self, Self::Error> { l.iter().cloned().map(dict::Namespace::try_derive).collect() } } impl Derive<Vec<Item>> for List { fn derive(v: Vec<Item>) -> Self { List::derive_iter(v) } } impl Derive<String> for List { fn derive(s: String) -> Self { List::derive_iter(s.chars()) } } impl TryDerive<Item> for List { type Error = Error; fn try_derive(i: Item) -> Result<Self, Self::Error> { match i { Item::Dispenser(l) => Sized::try_derive(l.clone()) .caused(Error::expected("list", l)) .and_then(List::try_derive), Item::Receptacle(l) => Sized::try_derive(l).and_then(List::try_derive), i => Err(Error::expected("list", i)), } } } impl TryDerive<Item> for Sized { type Error = Error; fn try_derive(item: Item) -> Result<Self, Self::Error> { match item { Item::Dispenser(c) => c.try_fit(), Item::Receptacle(p) => Dispenser::try_derive(p.clone()) .caused(Error::expected("sized", p))? 
.try_fit(), i => { // let bt = backtrace::Backtrace::new(); // println!("try from item {:?},\n {:?}", i, bt); Err(Error::expected("sized", i)) } } } } impl TryDerive<Item> for Receptacle { type Error = Error; fn try_derive(item: Item) -> Result<Self, Self::Error> { match item { Item::Receptacle(p) => Ok(p), Item::Dispenser(c) => c.try_fit(), i => Err(Error::expected("receptacle", i)), } } } impl TryDerive<Dispenser> for Receptacle { type Error = Error; fn try_derive(c: Dispenser) -> Result<Self, Self::Error> { match c { Dispenser::Sized(s) => Ok(Receptacle::Sized(s)), Dispenser::Tunnel(t) => Ok(Receptacle::Tunnel(t)), i => Err(Error::expected("receptacle", i)), } } } impl TryDerive<Receptacle> for Dispenser { type Error = Error; fn try_derive(c: Receptacle) -> Result<Self, Self::Error> { match c { Receptacle::Sized(s) => Ok(Dispenser::Sized(s)), Receptacle::Tunnel(t) => Ok(Dispenser::Tunnel(t)), i => Err(Error::expected("iterable", Item::Receptacle(i))), } } } impl TryDerive<Item> for Box<dyn Iterator<Item = Item>> { type Error = Error; fn try_derive(item: Item) -> Result<Self, Self::Error> { Ok(Sized::try_derive(item)?.into_iter()) } } impl Derive<Sized> for Box<dyn Iterator<Item = Item>> { fn derive(sized: Sized) -> Self { Box::new(sized.into_iter()) } } impl Derive<List> for Sized { fn derive(l: List) -> Self { Sized::List(l) } } impl Derive<String> for Sized { fn derive(s: String) -> Self { Sized::String(s) } } impl Derive<Bytes> for Sized { fn derive(b: Bytes) -> Self { Sized::Bytes(b) } } impl Derive<Sized> for Dispenser { fn derive(s: Sized) -> Self { Dispenser::Sized(s) } } impl Derive<List> for Item { fn derive(l: List) -> Self { Item::Dispenser(Dispenser::Sized(Sized::List(l))) } } impl Derive<Set> for Item { fn derive(l: Set) -> Self { Item::Dispenser(Dispenser::Sized(Sized::Set(l))) } } impl Derive<Dispenser> for Item { fn derive(c: Dispenser) -> Self { Item::Dispenser(c) } } impl Derive<Receptacle> for Item { fn derive(c: Receptacle) -> Self { 
Item::Receptacle(c) } } impl Derive<Sized> for Item { fn derive(s: Sized) -> Self { Dispenser::Sized(s).fit() } } impl TryDerive<Item> for Dispenser { type Error = Error; fn try_derive(item: Item) -> Result<Self, Self::Error> { match item { Item::Dispenser(c) => Ok(c), Item::Receptacle(p) => Ok(Dispenser::try_derive(p)?), i => Err(Error::expected("iterable", i)), } } } impl TryDerive<Item> for Set { type Error = Error; fn try_derive(item: Item) -> Result<Self, Self::Error> { let s = Sized::try_derive(item)?; let hs: HashSet<assoc::KeyItem> = s .into_iter() .map(|i| i.try_fit()) .collect::<Result<_, Error>>()?; Ok(Set::derive(hs)) } } impl Default for Sized { fn default() -> Self { Sized::List(List::default()) } } impl Default for Dispenser { fn default() -> Self { Dispenser::Sized(Sized::default()) } } impl Default for Receptacle { fn default() -> Self { Receptacle::Sized(Sized::default()) } } impl ToIterator for Sized { type Item = Item; type IntoIter = Box<dyn Iterator<Item = Item>>; fn to_iter<'a>(self) -> Self::IntoIter { self.into_iter() } } mod serde { use super::{Dispenser, Receptacle, Sized}; use crate::serialize::Display; use crate::traits::*; use crate::types::container::associative as assoc; use serde::ser::{Serialize, SerializeMap, SerializeSeq}; impl Serialize for Dispenser { fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error> where S: serde::Serializer, { match self { Dispenser::Out(o) => o.representation().serialize(serializer), Dispenser::Tunnel(t) => t.representation().serialize(serializer), Dispenser::Sized(s) => s.serialize(serializer), } } } impl Serialize for Receptacle { fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error> where S: serde::Serializer, { match self { Receptacle::In(i) => i.representation().serialize(serializer), Receptacle::Tunnel(t) => t.representation().serialize(serializer), Receptacle::Sized(s) => s.serialize(serializer), } } } impl Serialize for Sized { fn serialize<S>(&self, serializer: S) -> 
Result<S::Ok, S::Error> where S: serde::Serializer, { match self { Sized::Associative(a) => { // Start serializing a map let assoc = assoc::Association::derive(a.clone()); let mut map = serializer.serialize_map(Some(assoc.len()))?; for (key, value) in assoc.iter() { // Serialize each entry in the map map.serialize_entry(&key, &value)?; } // Finish serializing the map map.end() } Sized::List(ref l) => { // Serialize a list (sequence) let mut seq = serializer.serialize_seq(Some(l.len()))?; for element in l.iter() { seq.serialize_element(&element)?; } seq.end() } Sized::Bytes(b) => serializer.serialize_bytes(b.as_slice()), Sized::Set(s) => { // Serialize a list (sequence) let mut seq = serializer.serialize_seq(Some(s.len()))?; for element in s.iter() { seq.serialize_element(&element)?; } seq.end() } Sized::String(s) => serializer.serialize_str(s.as_str()), } } } }
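The `Join` impls above dispatch on content: joining a `String` with a `List` produces a `String` only when every list item is a char, and otherwise demotes the result to a `List`. A minimal, self-contained sketch of that dispatch (using stand-in types, not the kcats `Item`/`Sized` machinery) looks like this:

```rust
// Stand-in types for illustration only; the real kcats code uses
// Item, Sized, and the Join trait defined above.
#[derive(Debug, PartialEq, Clone)]
enum Item {
    Char(char),
    Int(i64),
}

#[derive(Debug, PartialEq)]
enum Joined {
    Str(String),
    List(Vec<Item>),
}

fn join_string_list(mut s: String, list: Vec<Item>) -> Joined {
    // Try to view every list item as a char; collect() on
    // Option<char> short-circuits at the first non-char.
    let chars: Option<String> = list
        .iter()
        .map(|i| match i {
            Item::Char(c) => Some(*c),
            _ => None,
        })
        .collect();
    match chars {
        // All chars: append as text, result stays a String.
        Some(tail) => {
            s.push_str(&tail);
            Joined::Str(s)
        }
        // Any non-char item demotes the result to a List,
        // exploding the string into individual chars.
        None => {
            let mut out: Vec<Item> = s.chars().map(Item::Char).collect();
            out.extend(list);
            Joined::List(out)
        }
    }
}

fn main() {
    assert_eq!(
        join_string_list("ab".into(), vec![Item::Char('c')]),
        Joined::Str("abc".into())
    );
    // A single non-char item forces the list form: ['a', 1].
    match join_string_list("a".into(), vec![Item::Int(1)]) {
        Joined::List(v) => assert_eq!(v.len(), 2),
        _ => panic!("expected list"),
    }
    println!("ok");
}
```

The same "promote when possible, demote when not" pattern recurs throughout the module, e.g. in `Join<Sized> for Sized` and in the `Associative` joins below.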
1.5.3.4. Associative types
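The module below restricts associative keys to `KeyItem`, which deliberately excludes floating point numbers and sets. The reason is that hash-map keys need reflexive equality (`Eq`) and a hash, and IEEE-754 `NaN` is not equal to itself, so Rust's `f64` only implements `PartialEq` and cannot be a `HashMap` key. A quick illustration of the underlying issue:

```rust
use std::collections::HashMap;

fn main() {
    // NaN breaks reflexivity: nan == nan is false, so floats
    // can't satisfy the Eq contract that hashing requires.
    let nan = f64::NAN;
    assert!(nan != nan);

    // Integers are fine: equality is reflexive and total, so
    // they qualify as map keys (and as KeyItem::Int in kcats).
    let mut m: HashMap<i64, &str> = HashMap::new();
    m.insert(1, "one");
    assert_eq!(m.get(&1), Some(&"one"));
    println!("ok");
}
```

(See also issue 2.12 "Make floats hashable" in the table of contents, which tracks lifting this restriction.)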
//! Support for Associative data types (similar contract to Rust's //! HashMap). Includes specific runtime data types like Errors, //! Dictionaries, Environments, as well as generic maps (which are //! called "associations" in kcats) use super::{dictionary as dict, environment as env}; use crate::traits::*; use crate::types::container as coll; use crate::types::container::{Count, Join, Mutey}; use crate::types::number::{Int, Number}; use crate::types::*; use std::collections::HashSet; use std::convert::Infallible; use std::sync::{self, Arc}; pub type Associationy<K, V> = HashMap<K, V>; pub type AssociationContent = Associationy<KeyItem, Item>; pub type Association = coll::Arc<AssociationContent>; /// A KeyItem is all the Item types that can be used as a key in an /// Associative structure. In order to be a key, the type has to be /// hashable and have an ordering, so types like floating point /// numbers or sets can't be used. #[derive(Debug, Clone, Eq, PartialEq, Hash, PartialOrd, Ord)] pub enum KeyItem { // Order matters here, for comparison purposes - changing the // order will change the result of how eg int compares to word. Int(Int), Char(Char), Word(Word), Bytes(Bytes), String(String), List(KeyList), } /// An Entry is a single pairing in an Associative type pub type Entry = (KeyItem, Item); pub type KeyListContent = coll::Listy<KeyItem>; pub type KeyList = coll::Arc<KeyListContent>; impl TryDeriveIterator<Item> for KeyList { fn try_from_iter<I>(l: I) -> Result<Self, Error> where I: IntoIterator<Item = Item>, { Ok(sync::Arc::new( l.into_iter() .map(KeyItem::try_derive) .collect::<Result<VecDeque<KeyItem>, Error>>()?, )) } } /// An Associative is a container type that associates one Item (the /// key) with another (the value). It has the property where you can /// look up a value using the key, and you can update the value that a /// key points to. Some Item types cannot be used as keys, only /// [KeyItem] is accepted as an Associative key. 
#[derive(Debug, Clone)]
pub enum Associative {
    /// A generic associative structure where you can associate any
    /// [KeyItem] with any [Item].
    Assoc(Association),
    /// Represents a [dict::Dictionary] entry structure with
    /// specific keys.
    DictEntry(dict::Entry),
    /// Represents an execution environment, with specific keys
    Env(env::Environment),
    /// Represents a runtime Error value, with specific keys
    Error(Error),
    /// Represents the words available to use
    Words(dict::Words),
    /// Represents a dictionary, including which modules have priority
    Dictionary(dict::Dictionary),
    Nothing,
}

impl Derive<KeyItem> for Item {
    fn derive(i: KeyItem) -> Self {
        match i {
            KeyItem::Int(i) => Item::Number(Number::Int(i)),
            KeyItem::String(i) => i.fit(),
            KeyItem::List(l) => coll::List::derive_iter(l.iter().cloned().map(Item::derive)).fit(),
            KeyItem::Word(w) => Item::Word(w),
            KeyItem::Bytes(bs) => bs.fit(),
            KeyItem::Char(c) => Item::Char(c),
        }
    }
}

impl Derive<&str> for KeyItem {
    fn derive(i: &str) -> Self {
        KeyItem::Word(Word::derive(i))
    }
}

impl Derive<Word> for KeyItem {
    fn derive(i: Word) -> Self {
        KeyItem::Word(i)
    }
}

impl TryDerive<Item> for KeyItem {
    type Error = Error;
    fn try_derive(i: Item) -> Result<Self, Error> {
        match i {
            Item::Number(Number::Int(i)) => Ok(KeyItem::Int(i)),
            Item::Word(w) => Ok(KeyItem::Word(w)),
            Item::Char(c) => Ok(KeyItem::Char(c)),
            i => match coll::Sized::try_derive(i)?
{ coll::Sized::String(i) => Ok(KeyItem::String(i)), coll::Sized::Bytes(i) => Ok(KeyItem::Bytes(i)), coll::Sized::List(l) => { Ok(KeyItem::List(KeyList::try_from_iter(l.iter().cloned())?)) } s => { println!("Bad keyitem {:?}", s); Err(Error::expected("KeyItem", s)) } }, } } } impl TryDerive<KeyItem> for Word { type Error = Error; fn try_derive(k: KeyItem) -> Result<Self, Self::Error> { match k { KeyItem::Word(w) => Ok(w), KeyItem::String(s) => Ok(s.fit()), i => Err(Error::expected("word", i)), } } } impl PartialEq for Associative { fn eq(&self, other: &Self) -> bool { match (self, other) { (Associative::Assoc(a), Associative::Assoc(b)) => a == b, (Associative::DictEntry(a), Associative::DictEntry(b)) => a == b, (Associative::Env(a), Associative::Env(b)) => a == b, (Associative::Error(a), Associative::Error(b)) => a == b, (Associative::Dictionary(a), Associative::Dictionary(b)) => a == b, (Associative::Nothing, Associative::Nothing) => true, //(Associative::Assoc(a), b) => Association::derive(a) == Association::derive(b), //(a, Associative::Assoc(b)) => Association::derive(a) == Association::derive(b), _ => false, } } } impl coll::Join<coll::List> for Association { type Output = Association; type Error = Error; fn join(self, other: coll::List) -> Result<Self::Output, Self::Error> { //println!("Joining list to association"); let la = Association::try_from_iter(other.iter().cloned())?; Ok(self.join(Associative::Assoc(la)).unwrap()) } } // impl coll::Join<Association> for Associative { // type Output = Associative; // type Error = Infallible; // fn join(self, other: Association) -> Result<Self::Output, Error> { // let la = Association::try_from_iter(other.iter().cloned())?; // self.join(Associative::Assoc(la)) // } // } impl coll::Join<Associative> for Association { type Output = Association; type Error = Infallible; fn join(mut self, other: Associative) -> Result<Self::Output, Self::Error> { let thism = self.mutate(); thism.extend(other.to_iter()); Ok(self) } } /// The 
join operation is for generic containers, but we can join /// two Associatives by merging them together. If both /// Associatives are the same specific type, the type is /// preserved. If `other` can be converted to the same specific /// type as `self`, that conversion will be done and the specific /// type of `self` is preserved. If they are different types and we /// can't convert `other` to `self`s type, the result will be /// demoted to a more generic form. /// /// Keys in `other` have priority over those in `self` - if a key /// is in both containers, the result will have only the value /// from `other`. impl coll::Join<Associative> for Associative { type Output = Associative; type Error = Infallible; fn join(self, other: Associative) -> Result<Self::Output, <Self as Join<Associative>>::Error> { //println!("Joining associative to associative"); Ok(match (self, other) { // same type means 2nd one wins. //TODO: a little more complex for types that can be extended (Associative::DictEntry(_), Associative::DictEntry(other)) => { Associative::DictEntry(other) } (Associative::Dictionary(this), Associative::Dictionary(other)) => { Associative::Dictionary(this.join(other).unwrap()) } //(Associative::Dictionary(this), Associative::Assoc(other)) => {} (Associative::Words(this), Associative::Assoc(other)) => this.join(other).unwrap(), (Associative::Assoc(this), Associative::Words(other)) => this.join(other).unwrap(), (Associative::Error(_), Associative::Error(other)) => Associative::Error(other), (Associative::Env(_), Associative::Env(other)) => Associative::Env(other), (Associative::Nothing, Associative::Nothing) => Associative::Nothing, // This is infallible so .unwrap should be safe (Associative::Assoc(this), other) => Associative::Assoc(this.join(other).unwrap()), (this, other) => { unimplemented!("Join between associatives: {:?} \n\n{:?}", this, other) } }) } } impl coll::SimpleTake for Association { type Item = Entry; fn take_simple(&mut self) -> Option<Self::Item> 
{
        let maybe_key = self.keys().next().cloned();
        let am = self.mutate();
        let maybe_value = maybe_key.as_ref().and_then(|key| am.remove(key));
        maybe_key.map(|key| (key, maybe_value.unwrap_or_default()))
    }
}

/// The take operation is for generic containers, but we can
/// perform it on an Associative by removing an arbitrary pair and
/// returning it.
impl coll::SimpleTake for Associative {
    type Item = Item;
    fn take_simple(&mut self) -> Option<Self::Item> {
        match self {
            Associative::Assoc(ref mut a) => a.take_simple().map(Item::derive),
            Associative::Words(ref mut d) => d.take_simple().map(Item::derive),
            // The remaining impls may require auto-demotion (eg,
            // removing a required field from say, Error). We'll just
            // demote all of them whether the field that is removed is
            // required or not, since the caller cannot know in
            // advance which it will be.
            ref a => {
                let mut assoc: Association = (*a).clone().fit();
                let v = assoc.take_simple().map(Item::derive);
                *self = Associative::Assoc(assoc);
                v
            }
        }
    }
}

impl Associative {
    /// Returns the number of associations in the container
    pub fn len(&self) -> usize {
        match self {
            Associative::Assoc(a) => a.len(),
            Associative::DictEntry(a) => a.len(),
            Associative::Env(e) => e.len(),
            Associative::Error(e) => e.len(),
            Associative::Words(d) => d.len(),
            Associative::Dictionary(d) => d.len(),
            Associative::Nothing => 0,
        }
    }

    /// Returns true if the container is empty
    pub fn is_empty(&self) -> bool {
        self.len() == 0
    }

    /// Inserts a new association of a [KeyItem] to [Item]. If the key
    /// already exists, the value is replaced and the old value is
    /// returned. If the key doesn't exist, a new one is created with
    /// the new value and no old value is returned. The overall return
    /// value is a tuple of an updated Associative, and an optional old
    /// value.
    ///
    /// The Associative returned is not necessarily the same type as
    /// self, as sometimes there is auto-demotion, eg from Error to
    /// Association.
Demotion typically happens when you insert a key /// into a type that doesn't support that key, you'll get a more /// generic type back instead. pub fn insert(self, k: KeyItem, v: Item) -> (Associative, Option<Item>) { //println!("Insert! {:?}", self); match self { Associative::Assoc(mut a) => { let am = coll::Arc::mutate(&mut a); let e = am.insert(k, v); (Associative::Assoc(a), e) } Associative::Words(mut d) => match (k, v) { (KeyItem::Word(w), e) => { let e2 = e.clone(); if let Ok(e) = dict::Entry::try_derive(e) { let dm = coll::Arc::mutate(&mut d); let e = dm.insert(w.fit(), e).map(Item::derive); (Associative::Words(d), e) } else { // TODO silently failing to insert here is bad println!("Warning, failed to insert into dictionary: {:?}", e2); (Associative::Words(d), None) } } _ => (Associative::Words(d), None), }, Associative::Env(e) => e.insert(k, v), Associative::DictEntry(mut de) => match k { KeyItem::Word(ref w) => { let w: &str = w.fit(); if w == "definition" { let l = coll::List::try_derive(v); match l { Ok(l) => { de.definition = dict::Definition::Derived(l); (Associative::DictEntry(de), None) // TODO: return the old def } Err(_) => (Associative::DictEntry(de), None), } } else if w == "examples" { let l = coll::List::try_derive(v); match l { Ok(l) => { de.examples = Some(l); (Associative::DictEntry(de), None) // TODO: return the old examples } Err(_) => (Associative::DictEntry(de), None), } } else if w == "spec" { let l = coll::List::try_derive(v); match l { Ok(l) => { de.spec = l.try_fit().ok(); (Associative::DictEntry(de), None) // TODO: return the old spec } Err(_) => (Associative::DictEntry(de), None), } } else { (Associative::DictEntry(de), None) } } _ => (Associative::DictEntry(de), None), }, Associative::Dictionary(mut d) => match k { KeyItem::Word(ref w) => { let w: &str = w.fit(); if w == "words" { let e = dict::Words::try_derive(v); match e { Ok(words) => { d.words = words; (Associative::Dictionary(d), None) // TODO: return the old entries } 
Err(_) => (Associative::Dictionary(d), None), } } else if w == "modules" { let l = Vec::<dict::Namespace>::try_derive(v); match l { Ok(modules) => { d.modules = modules; (Associative::Dictionary(d), None) // TODO: return the old modules } Err(_) => (Associative::Dictionary(d), None), } } else { (Associative::Dictionary(d), None) } } _ => (Associative::Dictionary(d), None), }, _ => todo!("insert Implementations for error, dictionary, env etc"), } } /// The put operation is for generic containers, adding a new Item /// to the container. In the case of Associative, we can still do /// this if the Item is the right type: a key/value pair. If it's /// the right type, we [Self::insert] the value using the key, /// otherwise return an error. pub fn put(self, other: Item) -> Result<Associative, Error> { match (self, other) { (Associative::Words(mut this), other) => { let (word, entry) = <(dict::Word, dict::Entry)>::try_derive(other)?; let thismut = this.mutate(); thismut.insert(word.fit(), entry.fit()); Ok(Associative::Words(this)) } (this, other) => { let entry: (KeyItem, Item) = other.try_fit()?; Ok(this.insert(entry.0, entry.1).0) } } } /// Retrieves a value from the container using the key /// `k`. Returns [None] if the key is not present. pub fn get(&self, k: &KeyItem) -> Option<Item> { match self { Associative::Assoc(a) => a.get(k).cloned(), Associative::Dictionary(d) => d.get(k), Associative::Error(e) => e.data.get(k).cloned(), Associative::Env(e) => e.get(k), Associative::DictEntry(d) => d.get(k), Associative::Words(d) => match k { KeyItem::Word(w) => d.get(&w.clone().fit()).map(|x| x.clone().fit()), _ => None, }, &Associative::Nothing => None, } } /// Returns true if the key `k` is present in the container. 
pub fn contains_key(&self, k: &KeyItem) -> bool { match self { Associative::Assoc(a) => a.contains_key(k), Associative::Error(e) => e.data.contains_key(k), Associative::Env(e) => e.contains_key(k), Associative::DictEntry(d) => d.contains_key(k), Associative::Dictionary(d) => d.contains_key(k), Associative::Words(d) => match k { KeyItem::Word(w) => d.contains_key(&w.clone().fit()), _ => false, }, &Associative::Nothing => false, } } /// Removes the key `k` from the container, returning a tuple of a /// new [Associative] and an optional value if the key was /// present. pub fn remove(self, k: &KeyItem) -> (Associative, Option<Item>) { match self { Associative::Assoc(mut a) => { let am = coll::Arc::mutate(&mut a); let v = am.remove(k); (Associative::Assoc(a), v) } Associative::Words(mut d) => { let dm = coll::Arc::mutate(&mut d); let v = dm.remove(&dict::Word::try_derive(k.clone()).unwrap_or_default()); (Associative::Words(d), v.map(|v| v.fit())) } Associative::Error(mut e) => { let a = e.data.mutate(); let v = a.remove(k); (Associative::Error(e), v) } Associative::Env(e) => { let a = Association::derive_iter(e); Associative::Assoc(a).remove(k) } _ => todo!("Removing from other associative types"), } } } impl ToIterator for Associative { type Item = Entry; type IntoIter = Box<dyn Iterator<Item = Entry>>; fn to_iter<'a>(self) -> Self::IntoIter { match self { Associative::Assoc(a) => { let items: Vec<_> = a.iter().map(|(k, v)| (k.clone(), v.clone())).collect(); Box::new(items.into_iter()) } Associative::DictEntry(e) => Box::new(e.into_iter()), Associative::Dictionary(d) => Box::new(d.into_iter()), Associative::Words(d) => { let items: Vec<_> = d .iter() .map(|(k, v)| (KeyItem::Word(k.clone().fit()), v.clone().fit())) .collect(); Box::new(items.into_iter()) } Associative::Error(e) => e.into_iter(), Associative::Env(e) => e.into_iter(), Associative::Nothing => Box::new(std::iter::empty()), } } } impl Derive<Associative> for coll::List { fn derive(a: Associative) -> Self { 
        coll::List::derive_iter(a.to_iter())
    }
}

impl TryDerive<coll::Sized> for Associative {
    type Error = Error;
    fn try_derive(s: coll::Sized) -> Result<Self, Error> {
        match s {
            coll::Sized::Associative(a) => Ok(a),
            coll::Sized::String(i) => Err(Error::expected("associative", i)),
            coll::Sized::Bytes(i) => Err(Error::expected("associative", i)),
            s => Ok(Associative::Assoc(Association::try_from_iter(s)?)),
        }
    }
}

impl TryDerive<Item> for Associative {
    type Error = Error;
    fn try_derive(i: Item) -> Result<Self, Error> {
        let s = coll::Sized::try_derive(i)?;
        Associative::try_derive(s)
    }
}

// Convert anything that can be iterated over as Items to an
// Association. The items must be pairs that are convertible to
// Entry, otherwise it will return an error.
impl TryDeriveIterator<Item> for Association {
    fn try_from_iter<I>(l: I) -> Result<Self, Error>
    where
        I: IntoIterator<Item = Item>,
    {
        Ok(sync::Arc::new(
            l.into_iter()
                .map(|i| Entry::try_derive(i.clone()))
                .collect::<Result<HashMap<KeyItem, Item>, Error>>()?,
        ))
    }
}

impl Derive<HashMap<KeyItem, Item>> for Association {
    fn derive(h: HashMap<KeyItem, Item>) -> Self {
        sync::Arc::new(h)
    }
}

impl DeriveIterator<Entry> for Association {
    fn derive_iter<I>(iter: I) -> Self
    where
        I: IntoIterator<Item = Entry>,
    {
        sync::Arc::new(iter.into_iter().collect::<HashMap<KeyItem, Item>>())
    }
}

impl DeriveIterator<Entry> for coll::List {
    fn derive_iter<I>(iter: I) -> Self
    where
        I: IntoIterator<Item = Entry>,
    {
        coll::Arc::new(
            iter.into_iter()
                .map(|e| e.fit())
                .collect::<VecDeque<Item>>(),
        )
    }
}

impl DeriveIterator<KeyItem> for KeyList {
    fn derive_iter<I>(iter: I) -> Self
    where
        I: IntoIterator<Item = KeyItem>,
    {
        sync::Arc::new(iter.into_iter().collect::<VecDeque<KeyItem>>())
    }
}

impl Derive<Entry> for Item {
    fn derive(e: Entry) -> Item {
        coll::List::derive_iter([Item::derive(e.0), e.1]).fit()
    }
}

impl TryDerive<Item> for Entry {
    type Error = Error;
    fn try_derive(i: Item) -> Result<Self, Error> {
        let s = coll::Sized::try_derive(i)?;
        if s.count()
!= 2 { Err(Error::expected("pair", s)) } else { let mut iter = s.into_iter(); let key: KeyItem = iter.next().unwrap().try_fit()?; let value = iter.next().unwrap(); Ok((key, value)) } } } impl Derive<Associative> for Association { fn derive(a: Associative) -> Association { match a { Associative::Assoc(a) => a, a => Association::derive_iter(a.to_iter()), } } } impl Derive<AssociationContent> for Item { fn derive(a: AssociationContent) -> Item { sync::Arc::new(a).fit() } } impl Derive<Association> for Item { fn derive(a: Association) -> Item { Associative::Assoc(a).fit() } } impl Derive<Associative> for Item { fn derive(a: Associative) -> Item { coll::Sized::Associative(a).fit() } } impl Derive<(KeyItem, Item)> for KeyItem { fn derive((k, _): (KeyItem, Item)) -> KeyItem { k } } /// Converting Associative to Set just returns the keys. impl Derive<Associative> for coll::Set { fn derive(a: Associative) -> coll::Set { Arc::new(HashSet::from_iter(a.to_iter().map(|(k, _)| k))) } } impl<T, E, C> DeriveIterator<Result<T, E>> for Result<C, E> where C: DeriveIterator<T>, { fn derive_iter<I: IntoIterator<Item = Result<T, E>>>(iter: I) -> Self { let mut result = Vec::new(); for item in iter { match item { Ok(value) => result.push(value), Err(e) => return Err(e), } } Ok(C::derive_iter(result)) } } pub trait Convert<KA, VA> { /// Convert from any type of hashmap to any other, assuming the keys /// and values convert fn convert<KB, VB>(&self) -> Result<HashMap<KB, VB>, Error> where KB: Clone + Eq + Hash + TryDerive<KA, Error = Error>, VB: Clone + TryDerive<VA, Error = Error>, KA: Clone + Eq + Hash, // Assuming Clone is needed for TryFrom VA: Clone; } impl<KA, VA> Convert<KA, VA> for HashMap<KA, VA> where KA: Eq + Hash + Clone, VA: Clone, { fn convert<KB, VB>(&self) -> Result<HashMap<KB, VB>, Error> where KB: Clone + Eq + Hash + TryDerive<KA, Error = Error>, VB: Clone + TryDerive<VA, Error = Error>, KA: Clone + Eq + Hash, // Assuming Clone is needed for TryFrom VA: Clone, { let mut 
new_hashmap = HashMap::new(); for (key, value) in self.iter().map(|(k, v)| (k.clone(), v.clone())) { let new_key: KB = key.try_fit()?; let new_value: VB = value.try_fit()?; new_hashmap.insert(new_key, new_value); } Ok(new_hashmap) } } mod serde { use super::{KeyItem, KeyList}; use crate::traits::*; use serde::de::{self, Deserialize, Deserializer, Visitor}; use serde::ser::{Serialize, SerializeSeq}; use std::fmt; struct KeyItemVisitor; impl<'de> Visitor<'de> for KeyItemVisitor { type Value = KeyItem; fn expecting(&self, formatter: &mut fmt::Formatter) -> fmt::Result { formatter.write_str("expected a specific representation for Item") } fn visit_i64<E>(self, value: i64) -> Result<Self::Value, E> where E: de::Error, { Ok(KeyItem::Int(value)) } fn visit_u64<E>(self, value: u64) -> Result<Self::Value, E> where E: de::Error, { Ok(KeyItem::Int(value as i64)) } fn visit_str<E>(self, v: &str) -> Result<Self::Value, E> where E: de::Error, { Ok(KeyItem::String(v.to_string())) } fn visit_byte_buf<E>(self, v: Vec<u8>) -> Result<Self::Value, E> where E: de::Error, { Ok(KeyItem::Bytes(v)) } fn visit_seq<A>(self, mut seq: A) -> Result<Self::Value, A::Error> where A: de::SeqAccess<'de>, { let mut items: Vec<KeyItem> = Vec::new(); while let Some(item) = seq.next_element::<KeyItem>()? 
{ items.push(item); } Ok(KeyItem::List(KeyList::derive_iter(items))) } } impl<'de> Deserialize<'de> for KeyItem { fn deserialize<D>(deserializer: D) -> Result<Self, D::Error> where D: Deserializer<'de>, { deserializer.deserialize_any(KeyItemVisitor) } } impl Serialize for KeyItem { fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error> where S: serde::Serializer, { match self { KeyItem::Int(i) => serializer.serialize_i64(*i), KeyItem::Word(w) => serializer.serialize_str(w.fit()), KeyItem::Char(c) => serializer.serialize_char(*c), KeyItem::Bytes(b) => serializer.serialize_bytes(b.as_slice()), KeyItem::List(ref l) => { // Serialize a list (sequence) let mut seq = serializer.serialize_seq(Some(l.len()))?; for element in l.iter() { seq.serialize_element(&element)?; } seq.end() } KeyItem::String(s) => serializer.serialize_str(s.as_str()), } } } }
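The `Associative::insert` contract above, returning the updated container together with the displaced old value, can be sketched in isolation. This is a minimal standalone illustration using a plain std `HashMap` (the function name `insert_returning` is hypothetical, not part of the crate):

```rust
use std::collections::HashMap;

// Hypothetical standalone mirror of Associative::insert's shape: the
// caller gets back the updated container plus the displaced old value,
// rather than mutating through a reference.
fn insert_returning(
    mut map: HashMap<String, i64>,
    k: &str,
    v: i64,
) -> (HashMap<String, i64>, Option<i64>) {
    let old = map.insert(k.to_string(), v);
    (map, old)
}
```

Inserting a fresh key yields `None` for the old value; replacing an existing key hands the previous value back, which is exactly how the real implementation reports replacements (modulo the auto-demotion cases).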
1.5.3.5. Error types
use super::associative as assoc; use crate::list; use crate::traits::*; use crate::types::container::{self as coll, Mutey}; use crate::types::number::Int; use crate::types::{Item, Word}; use std::convert::Infallible; /// Represents a runtime error type. Contains generic fields to hold /// things like what type of error, the actual vs expected conditions, /// etc. Also holds whether the error has been handled or not, which /// the runtime uses to decide whether to keep unwinding the program /// looking for something to handle the error. An error that has been /// handled is inert, it is just another data value. #[derive(Clone, PartialEq)] pub struct Error { pub data: assoc::Association, pub is_handled: bool, } pub trait Nested { fn caused(self, other: Error) -> Self; } impl<T, E> Nested for Result<T, E> where E: Nested, { fn caused(self, other: Error) -> Self { self.map_err(|e| e.caused(other)) } } impl Nested for Error { fn caused(self, mut e: Error) -> Error { e.data.mutate().insert("cause".fit(), self.fit()); e } } impl Error { /// Creates a new error. pub fn create<T: Fit<Item>>(asked: coll::List, reason: &str, actual: Option<T>) -> Error { // let bt = backtrace::Backtrace::new(); let mut data: Vec<(assoc::KeyItem, Item)> = vec![ ("type".fit(), "error".fit()), ("asked".fit(), asked.fit()), ("reason".fit(), reason.to_string().fit()), //("backtrace".fit(), Item::String(format!("{:?}", bt))), ]; if let Some(actual) = actual { data.push(("actual".fit(), actual.fit())); } Error { is_handled: false, data: assoc::Association::derive_iter(data), } } /// Creates a stack underflow error for when the current word /// needs more items than there are on the stack. 
pub fn stack_underflow() -> Error { Error::create( list!("consume"), "not enough items on stack", Option::<Item>::None, ) } pub fn overflow() -> Error { Error::create(list!("arithmetic"), "number overflow", Option::<Item>::None) } pub fn undefined(w: Word) -> Error { Error::create(list!(w), "word is not defined", Option::<Item>::None) } pub fn type_mismatch<T: Fit<Item>>(asked: coll::List, actual: Option<T>) -> Error { Error::create(asked, "type mismatch", actual) } pub fn division_by_zero() -> Error { Error::create(list!("/"), "division by zero", Option::<Item>::None) } pub fn expected<T: Fit<Item>>(typestr: &str, actual: T) -> Error { Error::type_mismatch(list!(typestr), Some(actual)) } pub fn short_list(expected: Int) -> Error { Error::create( list!("count", expected, ">="), "list had too few items", Option::<Item>::None, ) } pub fn list_count(expected: Int) -> Error { Error::create( list!("count", expected, "="), "list had wrong number of items", Option::<Item>::None, ) } pub fn negative(actual: Int) -> Error { Error::too_small(actual, 0) } pub fn too_small(actual: Int, expected: Int) -> Error { Error::create(list!(expected, ">="), "number too small", Some(actual)) } pub fn too_large(actual: Int, expected: Int) -> Error { Error::create(list!(expected, "<="), "number too large", Some(actual)) } pub fn parse(reason: &str) -> Error { Error::create(list!("read"), reason, Option::<Item>::None) } pub fn test_assertion(program: coll::List, expected: coll::List, actual: coll::List) -> Error { let mut e = Error::create(program, "assertion failed", Some(actual)); let d = e.data.mutate(); d.insert("expected-program".fit(), expected.fit()); e } pub fn len(&self) -> usize { self.data.len() } pub fn push(&mut self, key: assoc::KeyItem, value: Item) -> Option<Item> { self.data.mutate().insert(key, value) } } impl Derive<Infallible> for Error { fn derive(_x: Infallible) -> Self { match _x {} // Since Infallible can never be instantiated, this will never run } } impl 
Derive<Infallible> for Item { fn derive(_x: Infallible) -> Self { match _x {} // Since Infallible can never be instantiated, this will never run } } impl Derive<Error> for assoc::Association { fn derive(e: Error) -> assoc::Association { e.data } } impl TryDerive<Item> for Error { type Error = Error; fn try_derive(i: Item) -> Result<Self, Self::Error> { match i { Item::Dispenser(coll::Dispenser::Sized(coll::Sized::Associative( assoc::Associative::Error(e), ))) => Ok(e), Item::Dispenser(coll::Dispenser::Sized(coll::Sized::String(_))) | Item::Dispenser(coll::Dispenser::Sized(coll::Sized::Bytes(_))) | Item::Receptacle(coll::Receptacle::Sized(coll::Sized::String(_))) | Item::Receptacle(coll::Receptacle::Sized(coll::Sized::Bytes(_))) => { Err(Error::expected("error", Item::default())) } Item::Dispenser(coll::Dispenser::Sized(c)) => c.into_iter().try_fit(), i => Err(Error::expected("error", i)), } } } impl TryDerive<Box<dyn Iterator<Item = Item>>> for Error { type Error = Error; fn try_derive(i: Box<dyn Iterator<Item = Item>>) -> Result<Self, Self::Error> { //TODO: this can't fail, can just be a From. // Really though, Error should have predefined fields like Environment. 
let data = assoc::Association::try_from_iter(i)?; Ok(Error { data, is_handled: false, }) } } impl TryDerive<assoc::Associative> for Error { type Error = Error; fn try_derive(a: assoc::Associative) -> Result<Self, Self::Error> { match a { assoc::Associative::Error(e) => Ok(e), assoc::Associative::Assoc(a) => { if a.get(&assoc::KeyItem::derive("type")) != Some(&Item::derive("error")) { Err(Error::expected("error", a)) } else { Ok(Error { data: a.clone(), is_handled: true, }) } } i => Err(Error::expected("error", i)), } } } impl Derive<Error> for Item { fn derive(e: Error) -> Item { assoc::Associative::Error(e).fit() } } impl IntoIterator for Error { type Item = assoc::Entry; type IntoIter = Box<dyn Iterator<Item = assoc::Entry>>; fn into_iter(self) -> Self::IntoIter { let items: Vec<_> = self .data .iter() .map(|(k, v)| (k.clone(), v.clone())) .chain(std::iter::once(("handled".fit(), self.is_handled.fit()))) .collect(); Box::new(items.into_iter()) } }
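Since errors here are ordinary associative data, the layout built by `Error::create` and the cause-chaining done by `Nested::caused` can be sketched with std types alone. The `Val`/`ErrData` names below are hypothetical stand-ins for the crate's `Item`/`Association`, simplified for illustration:

```rust
use std::collections::HashMap;

// Standalone sketch (std types only) of error-as-data: an error is a map
// with "type"/"asked"/"reason" keys plus a handled flag, and handling an
// error can nest it as the "cause" of a newer error.
#[derive(Debug, Clone, PartialEq)]
enum Val {
    Str(String),
    Err(Box<ErrData>),
}

#[derive(Debug, Clone, PartialEq)]
struct ErrData {
    data: HashMap<String, Val>,
    is_handled: bool,
}

fn create(asked: &str, reason: &str) -> ErrData {
    let mut data = HashMap::new();
    data.insert("type".into(), Val::Str("error".into()));
    data.insert("asked".into(), Val::Str(asked.into()));
    data.insert("reason".into(), Val::Str(reason.into()));
    ErrData { data, is_handled: false }
}

// Mirrors Nested::caused: the earlier (inner) error becomes the
// "cause" field of the later (outer) one.
fn caused(inner: ErrData, mut outer: ErrData) -> ErrData {
    outer.data.insert("cause".into(), Val::Err(Box::new(inner)));
    outer
}
```

Because an unhandled error is just a value with `is_handled == false`, the runtime can keep unwinding until something claims it, after which it is inert data like any other map.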
1.5.3.6. Dictionary types
use super::associative as assoc; use crate::axiom::BUILTIN_FUNCTIONS; use crate::list; use crate::serialize; use crate::traits::*; use crate::types::container::associative::Convert; use crate::types::container::{self as coll, Count, Mutey}; use crate::types::{self, Bytes, Error, Item}; use core::fmt; use internment::Intern; use std::collections::HashMap; use std::collections::HashSet; use std::convert::Infallible; use std::hash::Hash; use std::ptr; use std::sync::Arc; /// A word in a dictionary is slightly different than a 'word' piece /// of data: when looking up words in the dictionary, the namespace is /// used for comparison. Due to differences in equality checking, we /// use a wrapper type here so we can distinguish the behavior. #[derive(Eq, Debug, Clone, Default, PartialEq, Hash)] pub struct Word(pub types::Word, pub Namespace); /// Easily convert from [Word] impl Derive<Word> for types::Word { fn derive(w: Word) -> Self { w.0 } } /// Easily convert from [types::Word] impl Derive<types::Word> for Word { fn derive(w: types::Word) -> Self { Word(w, None) } } impl Derive<&str> for Word { fn derive(w: &str) -> Self { Word::derive(types::Word::derive(w)) } } impl TryDerive<assoc::KeyItem> for Word { type Error = Error; fn try_derive(k: assoc::KeyItem) -> Result<Self, Self::Error> { Ok(types::Word::try_derive(k)?.fit()) } } /// The definition of a [Word], contains its actual code (the /// definition), and also documentation like specs and examples. #[derive(Debug, Clone, PartialEq)] pub struct Entry { pub examples: Option<coll::List>, pub spec: Option<Spec>, pub definition: Definition, pub namespace: Namespace, } impl Eq for Entry {} // TODO: move specs to their own module /// An element of a [Spec], either an input or an output. Holds the /// type and optional name of the input/output. 
#[derive(Debug, Clone, PartialEq)] pub struct SpecElement { pub elemtype: types::Word, pub name: Option<types::Word>, } pub type StackSpec = Vec<SpecElement>; /// The spec of a [Word] consists of the input spec and the output /// spec, that shows what the stack should look like before and after /// the [Word] is invoked. pub type Spec = (StackSpec, StackSpec); impl TryDerive<Item> for SpecElement { type Error = Error; fn try_derive(i: Item) -> Result<SpecElement, Error> { match i { Item::Word(w) => Ok(SpecElement { elemtype: w, name: None, }), i => { let s = coll::List::try_derive(i)?; if s.len() != 2 { Err(Error::list_count(2)) } else { let t = types::Word::try_derive(s.front().unwrap().clone())?; let n = types::Word::try_derive(s.get(1).unwrap().clone())?; Ok(SpecElement { elemtype: t, name: Some(n), }) } } } } } impl TryDerive<coll::List> for StackSpec { type Error = Error; fn try_derive(s: coll::List) -> Result<StackSpec, Error> { s.iter() .cloned() .map(SpecElement::try_derive) //.map(|r| r.and_then(SpecElement::try_derive)) .collect::<Result<StackSpec, Error>>() } } impl TryDerive<coll::List> for Spec { type Error = Error; fn try_derive(s: coll::List) -> Result<Spec, Error> { if s.len() != 2 { Err(Error::list_count(2)) } else { Ok(( StackSpec::try_derive(coll::List::try_derive(s.front().unwrap().clone())?)?, StackSpec::try_derive(coll::List::try_derive(s.get(1).unwrap().clone())?)?, )) } } } impl TryDerive<Item> for Spec { type Error = Error; fn try_derive(i: Item) -> Result<Spec, Error> { Spec::try_derive(coll::List::try_derive(i)?) 
} } impl Derive<SpecElement> for Item { fn derive(se: SpecElement) -> Item { if se.name.is_some() { list!(se.elemtype, se.name).fit() } else { Item::Word(se.elemtype) } } } impl Derive<Spec> for Item { fn derive(s: Spec) -> Item { list!(s.0, s.1).fit() } } impl ToIterator for Vec<SpecElement> { type Item = SpecElement; type IntoIter = std::vec::IntoIter<SpecElement>; fn to_iter(self) -> Self::IntoIter { self.into_iter() } } impl ToIterator for Vec<Namespace> { type Item = Namespace; type IntoIter = std::vec::IntoIter<Namespace>; fn to_iter(self) -> Self::IntoIter { self.into_iter() } } impl Derive<Namespace> for Item { fn derive(ns: Namespace) -> Item { match ns { Some(ns) => (*ns).clone().fit(), None => Item::default(), } } } impl IntoList for Vec<Namespace> {} impl IntoList for Vec<SpecElement> {} //impl Derive<Intern<Vec<u8>>> for impl Entry { pub fn len(&self) -> usize { 3 // 3 fields } pub fn get(&self, key: &assoc::KeyItem) -> Option<Item> { match key { assoc::KeyItem::Word(w) => match w.data.as_str() { "spec" => self.spec.clone().map(|x| x.fit()), "examples" => self.examples.clone().map(|x| x.fit()), "definition" => Some(match self.definition.clone() { Definition::Axiom(_) => "builtin".fit(), Definition::Derived(d) => d.fit(), }), _ => None, }, _ => None, } } pub fn contains_key(&self, key: &assoc::KeyItem) -> bool { types::Word::try_derive(key.clone()).map_or(false, |ref w| { matches!(w.fit(), "examples" | "spec" | "definition") }) } } // TODO: Use the builtin Bytes type pub type Namespace = Option<Intern<Vec<u8>>>; pub fn bytes_to_ns(b: Bytes) -> Namespace { if b.is_empty() { Default::default() } else { Some(Intern::new(b)) } } impl TryDerive<Item> for Namespace { type Error = Error; fn try_derive(i: Item) -> Result<Namespace, Error> { match i { Item::Dispenser(coll::Dispenser::Sized(coll::Sized::Bytes(b))) => Ok(bytes_to_ns(b)), Item::Receptacle(coll::Receptacle::Sized(coll::Sized::Bytes(b))) => Ok(bytes_to_ns(b)), i => { let s = 
coll::Sized::try_derive(i)?; if s.is_empty() { Ok(Default::default()) } else { Err(Error::expected("namespace", s)) } } } } } /// Holds [Word]s and their definitions. pub type Words = coll::Arc<HashMap<Word, Entry>>; /// Words are looked up from the Cache, which takes modules into /// account. pub type Cache = HashMap<types::Word, Entry>; /// One of the main components of an /// [crate::types::container::environment::Environment]. Provides /// definitions of words and a resolver, which decides which /// definition of the same word to use (based on which module it comes /// from). #[derive(Clone, PartialEq)] pub struct Dictionary { pub words: Words, pub cache: Cache, pub modules: Vec<Namespace>, } /// A custom impl for Dictionary that doesn't dump a massive data /// structure. comment this out to get access to the full debug output. impl fmt::Debug for Dictionary { fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result { f.debug_struct("Dictionary") .field("words", &format_args!("Words(len={})", self.words.len())) .field("modules", &self.modules) .finish() } } impl Dictionary { /// Treats the [Dictionary] as an associative structure, /// returning one of its fields, or [None]. pub fn get(&self, key: &assoc::KeyItem) -> Option<Item> { match key { assoc::KeyItem::Word(w) => match w.data.as_str() { "words" => Some(self.words.clone().fit()), "modules" => Some(self.modules.clone().fit()), _ => None, }, _ => None, } } /// Get an [Entry] from the dictionary, doing namespace /// resolution. 
pub fn get_entry(&self, key: &types::Word) -> Option<Entry> { self.cache.get(key).cloned() } pub fn len(&self) -> usize { 2 } pub fn merge(&mut self, new: Self, namespace: &Namespace) { self.words.merge(new.words, namespace); //self.resolve(); } pub fn contains_key(&self, key: &assoc::KeyItem) -> bool { types::Word::try_derive(key.clone()) .map_or(false, |ref w| matches!(w.fit(), "modules" | "words")) } /// Produce an [Words] map that is pre-resolved using the /// modules from this dictionary. Saves computation at runtime /// because resolution is already done. pub fn resolve(&mut self) { fn group_by_namespace(words: &Words) -> HashMap<Namespace, Vec<(types::Word, Entry)>> { let mut grouped: HashMap<Namespace, Vec<(types::Word, Entry)>> = HashMap::new(); for (k, v) in words.iter() { grouped .entry(k.1) .or_insert_with(Vec::new) .push((k.0.clone(), v.clone())); } grouped } let by_ns = group_by_namespace(&self.words); //println!("by namespace: {:?}", by_ns); let mut cache = Cache::new(); // first all the non-namespaced core words that can be overridden cache.extend(by_ns.get(&None).cloned().unwrap_or_default()); for module in self.modules.iter() { cache.extend(by_ns.get(module).cloned().unwrap_or_default()) } // println!( // "Cache now selected {} words. modules: {:?}", // cache.len(), // by_ns.keys() // ); //println!("resolve: contains? {:?}", cache.get(&"contains?".fit())); self.cache = cache; //println!("After resolve: {:?}", self); } } pub trait Dict { /// Returns the difference between this dictionary and a "newer" /// one: The additions/updates, and the deletions. fn diff(&self, newer: Words) -> (Vec<(Word, Entry)>, Vec<Word>); /// Takes a core module (in string form - should contain a series /// of word definitions, not wrapped in a single list), and /// inserts all the definitions into the dictionary, with an /// optional namespace. 
    fn insert_core_module(&mut self, lexicon: String) -> Result<(), Error>;

    /// For stdlib words that are both built-in and part of a module
    /// that isn't necessarily loaded as part of the standard
    /// environment, we need to be able to link the word to its rust
    /// definition. Leaves other fields as None to be filled in later.
    fn builtins() -> Self;

    /// Merges this dictionary with the given new dictionary. The new
    /// words are added with the given namespace.
    fn merge(&mut self, new: Words, namespace: &Namespace);
}

impl coll::SimpleTake for Words {
    type Item = (Word, Entry);
    fn take_simple(&mut self) -> Option<Self::Item> {
        if let Some(ref k) = self.keys().next().cloned() {
            let dm = self.mutate();
            let v = dm.remove(k).unwrap();
            Some((k.clone(), v))
        } else {
            None
        }
    }
}

impl Dict for Words {
    fn diff(&self, newer: Words) -> (Vec<(Word, Entry)>, Vec<Word>) {
        diff_hashmaps(self, &newer)
    }

    fn merge(&mut self, new: Words, namespace: &Namespace) {
        let (adds, deletes) = self.diff(new);
        // add namespaces to the adds and deletes
        //println!("Merge {} adds, {} deletes", adds.len(), deletes.len());
        let adds: Vec<_> = adds
            .into_iter()
            .map(|(mut w, e)| {
                w.1 = *namespace;
                //println!("Adding {:?}", w);
                (w, e)
            })
            .collect();
        let deletes: Vec<_> = deletes
            .into_iter()
            .map(|mut w| {
                w.1 = *namespace;
                w
            })
            .collect();
        let d = self.mutate();
        d.extend(adds);
        d.extend(make_deletes(deletes));
        //println!("Contains: {:?}", d.get(&"contains?".fit()))
    }

    fn insert_core_module(&mut self, lexicon: String) -> Result<(), Error> {
        //println!("Parsing: {}", lexicon);
        let items = serialize::parse(lexicon)?;
        for r in Box::new(items.iter().cloned()) {
            let (k, def): (assoc::KeyItem, Item) = r.try_fit().unwrap();
            let word: Word = k.try_fit().unwrap();
            let iter: Box<dyn Iterator<Item = Item>> = def.try_fit().unwrap();
            let new_entry: Entry = iter.try_fit().unwrap();
            let new_entry2 = new_entry.clone();
            let dict = self.mutate();
            dict.entry(word)
                .and_modify(|e| {
                    e.examples = new_entry.examples;
                    e.spec =
new_entry.spec; // Don't overwrite the definition, this should be // an axiom word where we've left the // spec/examples temporarily blank and we're // filling them in now that we've read the // lexicon. The definition is the builtin and we // want to keep that. //e.definition = new_entry.definition; }) .or_insert(new_entry2); } Ok(()) } fn builtins() -> Self { let mut dict = HashMap::new(); for (bw, bd) in BUILTIN_FUNCTIONS.iter() { let entry = Entry { definition: bd.clone(), examples: None, spec: None, namespace: None, }; dict.insert(Word::derive(bw.clone()), entry); } Arc::new(dict) } } /// Each word should run a program that calls fail (already namespaced /// to the stdlib so that the word acts like it isn't in the dictionary /// even though it is.) fn make_deletes(words: Vec<Word>) -> Vec<(Word, Entry)> { words .into_iter() .map(|word| { //println!("Shadowing word: {:?}", word); let err = Error::create( list!(types::Word::derive(word.clone())), "word removed by module", Some("access-denied"), ); let entry = Entry { examples: None, spec: None, definition: Definition::Derived(list!(err, "fail")), namespace: word.1, }; (word, entry) }) .collect() } /// Returns an owned pair given a pair of references fn owned<T: Clone, U: Clone>(entry: (&T, &U)) -> (T, U) { (entry.0.clone(), entry.1.clone()) } /// Returns the differences between two hashmaps, including the keys /// that have been added or changed (including the new values), and /// the keys that were deleted. 
fn diff_hashmaps<K, V>(a: &HashMap<K, V>, b: &HashMap<K, V>) -> (Vec<(K, V)>, Vec<K>) where K: Eq + Hash + Clone, V: PartialEq + Clone, { let a_keys: HashSet<K> = a.keys().cloned().collect(); let b_keys: HashSet<K> = b.keys().cloned().collect(); // Keys that are in `b` but not in `a` or have updated values in `b` let added_or_updated: Vec<(K, V)> = b .iter() .filter(|(k, v)| !a_keys.contains(k) || a.get(k) != Some(v)) .map(owned) .collect(); // Keys that are in `a` but not in `b` let deleted: Vec<K> = a_keys.difference(&b_keys).cloned().collect(); (added_or_updated, deleted) } impl coll::Join<Words> for Words { type Output = Words; type Error = Infallible; fn join(mut self, other: Words) -> Result<Self::Output, Self::Error> { let sm = self.mutate(); sm.extend(other.iter().map(owned)); Ok(self) } } impl coll::Join<Dictionary> for Dictionary { type Output = Dictionary; type Error = Infallible; fn join(mut self, other: Self) -> Result<Self::Output, Self::Error> { self.words = self.words.join(other.words)?; Ok(self) } } impl coll::Join<assoc::Association> for Words { type Output = assoc::Associative; type Error = Infallible; fn join(mut self, other: assoc::Association) -> Result<Self::Output, Self::Error> { // Try to convert to dictionary type //println!("dict + assoc join"); match other.convert::<Word, Entry>() { Ok(d) => { let tm = self.mutate(); tm.extend(d); Ok(assoc::Associative::Words(self)) } // TODO: convert the other way (to assoc) instead Err(_) => { //println!("Conversion error: {:?}", e); Ok(assoc::Associative::Words(self)) } } } } impl coll::Join<Words> for assoc::Association { type Output = assoc::Associative; type Error = Infallible; fn join(self, mut other: Words) -> Result<Self::Output, Self::Error> { // Try to convert to dictionary type //println!("assoc + dict join"); Ok(match self.convert::<Word, Entry>() { Ok(d) => { let tm = other.mutate(); tm.extend(d.iter().map(owned)); assoc::Associative::Words(other) } // TODO: convert the other way (to assoc) 
instead Err(_) => assoc::Associative::Words(other), }) } } /// The actual code for what a [Word] should do. #[derive(Clone)] pub enum Definition { /// A definition in the base language - a rust function that /// modifies the environment. Axiom(&'static types::StepFn), /// A definition in terms of other [Word]s - a kcats program Derived(coll::List), } // dictionary words are equal if they have the same function reference, // no need to compare the function values impl PartialEq for Definition { fn eq(&self, other: &Self) -> bool { match (self, other) { (Definition::Axiom(s), Definition::Axiom(o)) => ptr::eq(*s, *o), (Definition::Derived(s), Definition::Derived(o)) => s == o, _ => false, } } } impl fmt::Debug for Definition { fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result { match self { Definition::Axiom(_) => f.write_str("Builtin"), Definition::Derived(d) => { let mut ds = f.debug_list(); ds.entries(d.iter()); ds.finish() } } } } impl IntoIterator for Entry { type Item = assoc::Entry; type IntoIter = Box<dyn Iterator<Item = assoc::Entry>>; fn into_iter(self) -> Self::IntoIter { let mut v: Vec<(assoc::KeyItem, Item)> = vec![("definition".fit(), { match self.definition { Definition::Derived(l) => l.fit(), Definition::Axiom(_) => "builtin-function".fit(), } })]; if let Some(e) = self.examples { v.push(("examples".fit(), e.fit())); } if let Some(s) = self.spec { v.push(("spec".fit(), s.fit())) } Box::new(v.into_iter()) } } impl IntoIterator for Dictionary { type Item = assoc::Entry; type IntoIter = Box<dyn Iterator<Item = assoc::Entry>>; fn into_iter(self) -> Self::IntoIter { let v: Vec<(assoc::KeyItem, Item)> = vec![ ("words".fit(), self.words.fit()), ("modules".fit(), self.modules.fit()), ]; Box::new(v.into_iter()) } } impl TryDerive<Box<dyn Iterator<Item = Item>>> for Entry { type Error = Error; fn try_derive(iter: Box<dyn Iterator<Item = Item>>) -> Result<Self, Error> { let mut examples: Option<coll::List> = None; let mut definition: Option<Definition> = None; 
let mut spec: Option<Spec> = None; let mut namespace: Namespace = None; for i in iter { let (k, v): (assoc::KeyItem, Item) = i.try_fit()?; //println!("k: {:?}, v: {:?}", k, v); if k == "examples".fit() { examples = Some(v.try_fit()?); } else if k == "definition".fit() { definition = Some(v.try_fit()?); } else if k == "spec".fit() { spec = v.try_fit().ok(); } else if k == "namespace".fit() { namespace = v.try_fit().unwrap_or_default(); } else { continue; } } Ok(Entry { examples, definition: definition.unwrap_or(Definition::Derived(coll::List::default())), spec, namespace, }) } } impl TryDerive<Box<dyn Iterator<Item = Item>>> for Words { type Error = Error; fn try_derive(iter: Box<dyn Iterator<Item = Item>>) -> Result<Self, Error> { iter.map(<(Word, Entry)>::try_derive) .collect::<Result<HashMap<Word, Entry>, Error>>() .map(Arc::new) } } impl TryDerive<Box<dyn Iterator<Item = Item>>> for Dictionary { type Error = Error; fn try_derive(iter: Box<dyn Iterator<Item = Item>>) -> Result<Self, Error> { let mut words = Words::default(); let mut modules = Vec::<Namespace>::default(); for i in iter { let (k, v): (assoc::KeyItem, Item) = i.try_fit()?; //println!("k: {:?}, v: {:?}", k, v); if k == "words".fit() { words = v.try_fit()?; } else if k == "modules".fit() { modules = v.try_fit()?; } else { continue; } } Ok(Dictionary { words: words, modules, cache: HashMap::new(), }) } } impl TryDerive<Item> for Definition { type Error = Error; fn try_derive(i: Item) -> Result<Self, Self::Error> { coll::List::try_derive(i).map(Definition::Derived) } } impl TryDerive<Item> for Entry { type Error = Error; fn try_derive(i: Item) -> Result<Self, Self::Error> { let s = coll::Sized::try_derive(i)?; match s { coll::Sized::Associative(assoc::Associative::DictEntry(d)) => Ok(d), c => c.into_iter().try_fit(), } } } impl Derive<Entry> for assoc::Associative { fn derive(d: Entry) -> assoc::Associative { let mut assoc = assoc::Association::fresh(); let a = assoc.mutate(); d.examples.and_then(|l| 
a.insert("examples".fit(), l.fit())); d.spec.and_then(|l| a.insert("spec".fit(), l.fit())); if let Definition::Derived(d) = d.definition { a.insert("definition".fit(), d.fit()); } assoc::Associative::Assoc(assoc) } } impl TryDerive<Item> for Words { type Error = Error; fn try_derive(i: Item) -> Result<Self, Self::Error> { let s = coll::Sized::try_derive(i)?; match s { coll::Sized::Associative(assoc::Associative::Words(d)) => Ok(d), c => c.into_iter().try_fit(), } } } impl TryDerive<Item> for Dictionary { type Error = Error; fn try_derive(i: Item) -> Result<Self, Self::Error> { let s = assoc::Associative::try_derive(i)?; match s { assoc::Associative::Dictionary(d) => Ok(d), //assoc::Associative::Assoc(a) => a.into_iter().try_fit(), a => Err(Error::expected("dictionary", a)), } } } impl Derive<Entry> for Item { fn derive(e: Entry) -> Self { Item::Dispenser(coll::Dispenser::Sized(coll::Sized::Associative( assoc::Associative::DictEntry(e), ))) } } impl Derive<Words> for Item { fn derive(e: Words) -> Self { Item::Dispenser(coll::Dispenser::Sized(coll::Sized::Associative( assoc::Associative::Words(e), ))) } } impl Derive<Dictionary> for Item { fn derive(d: Dictionary) -> Self { Item::Dispenser(coll::Dispenser::Sized(coll::Sized::Associative( assoc::Associative::Dictionary(d), ))) } } impl Derive<(Word, Entry)> for Item { fn derive((k, v): (Word, Entry)) -> Item { coll::List::derive_iter([Item::Word(k.fit()), Item::derive(v.clone())]).fit() } } impl TryDerive<Item> for (Word, Entry) { type Error = Error; fn try_derive(i: Item) -> Result<Self, Self::Error> { let s = coll::Sized::try_derive(i)?; if s.count() != 2 { Err(Error::expected("pair", s)) } else { let mut iter = s.into_iter(); let key: types::Word = iter.next().unwrap().try_fit()?; let value: Entry = iter.next().unwrap().try_fit()?; Ok((key.fit(), value)) } } }
1.5.3.7. Environment types
//! Functionality of a kcats execution environment. use super::{associative as assoc, dictionary as dict}; use crate::axiom; use crate::list; use crate::serialize; use crate::traits::*; use crate::types::container::dictionary::Dict; use crate::types::container::{self as coll, Mutey, Ordered}; use crate::types::number::Number; use crate::types::*; use once_cell::sync::Lazy; use std::future; use std::ops::Range; /// A struct to hold the state of an executing kcats program. The /// `stack` is the data being manipulated, the `program` is program /// remaining to be executed, and the `dictionary` is the set of /// functions available to the program. #[derive(Clone, PartialEq)] pub struct Environment { pub stack: Stack, pub program: Stack, pub dictionary: dict::Dictionary, } impl Environment { /// Push the [Item] onto the top of the stack. pub fn push<T: Fit<Item>>(&mut self, i: T) { coll::Arc::mutate(&mut self.stack).push_front(i.fit()); } /// Pop the top [Item] from the stack, panicking if the stack is /// empty. pub fn pop(&mut self) -> Item { coll::Arc::mutate(&mut self.stack).pop_front().unwrap() } /// Pushes one item onto the front of the program (so that it /// executes first). pub fn push_prog(&mut self, i: Item) { coll::Arc::mutate(&mut self.program).push_front(i); } /// Pops an [Item] from the front of the program. pub fn pop_prog(&mut self) -> Item { coll::Arc::mutate(&mut self.program).pop_front().unwrap() } /// Returns a reference to the top stack [Item], or [None] if it's /// empty. pub fn tos(&self) -> Option<&Item> { self.stack.front() } /// Returns the length of this struct (as an associative /// structure), which is constant. pub fn len(&self) -> usize { 3 // 3 fields } /// Treats the [Environment] as an associative structure, /// returning one of its fields, or [None]. 
pub fn get(&self, key: &assoc::KeyItem) -> Option<Item> { match key { assoc::KeyItem::Word(w) => match w.data.as_str() { "stack" => Some(self.stack.clone().fit()), "program" => Some(self.program.clone().fit()), "dictionary" => Some(self.dictionary.clone().fit()), _ => None, }, _ => None, } } /// Returns true if the [Environment] contains the given key, /// which is only true for its fixed fields. pub fn contains_key(&self, key: &assoc::KeyItem) -> bool { Word::try_derive(key.clone()).map_or(false, |ref w| { matches!(w.fit(), "stack" | "program" | "dictionary") }) } /// Inserts the key/value into the [Environment]. If the key is /// not one of its fixed fields, return a demoted /// [assoc::Associative] value that's more generic and supports /// any key. Also optionally return any old value that might get /// overwritten. pub fn insert(mut self, k: assoc::KeyItem, v: Item) -> (assoc::Associative, Option<Item>) { let demote = |o: Environment, k: assoc::KeyItem, v: Item| { //println!("Demotion!!! {:?}", o); let mut a = assoc::Association::derive_iter(o); let am = a.mutate(); let old = am.insert(k, v); (assoc::Associative::Assoc(a), old) }; match k { assoc::KeyItem::Word(ref w) => { let w: &str = w.fit(); match w { "stack" => { let l = coll::List::try_derive(v.clone()); match l { Ok(l) => { let old = self.stack.clone(); self.stack = l; (assoc::Associative::Env(self), Some(old.fit())) } Err(_) => demote(self, k, v), } } "program" => { let l = coll::List::try_derive(v.clone()); match l { Ok(l) => { let old = self.program.clone(); self.program = l; (assoc::Associative::Env(self), Some(old.fit())) } Err(_) => demote(self, k, v), } } "dictionary" => { let d = dict::Dictionary::try_derive(v.clone()); match d { Ok(d) => { let old = self.dictionary.clone(); self.dictionary = d; (assoc::Associative::Env(self), Some(old.fit())) } Err(_) => demote(self, k, v), } } k => demote(self, k.fit(), v), } } _ => demote(self, k, v), } } /// Reads a stdlib module and updates the dictionary. 
pub fn load_builtin_module(&mut self, module_alias: Word) -> Result<(), Error> { self.push(module_alias); axiom::read_blob(self) } /// Loads the core modules as part of preparing a standard /// environment. fn load_core_modules(&mut self) -> Result<(), Error> { // Assuming /project/core/ is in your project's root directory and part of the source let files: Vec<&[u8]> = vec![ include_bytes!("../../kcats/core/stack-builtins.kcats"), include_bytes!("../../kcats/core/motion-builtins.kcats"), include_bytes!("../../kcats/core/compare-builtins.kcats"), include_bytes!("../../kcats/core/math-builtins.kcats"), include_bytes!("../../kcats/core/boolean-builtins.kcats"), include_bytes!("../../kcats/core/serialize-builtins.kcats"), include_bytes!("../../kcats/core/encode-builtins.kcats"), include_bytes!("../../kcats/core/strings-builtins.kcats"), include_bytes!("../../kcats/core/errors-builtins.kcats"), include_bytes!("../../kcats/core/pipes-builtins.kcats"), include_bytes!("../../kcats/core/stack.kcats"), include_bytes!("../../kcats/core/motion.kcats"), include_bytes!("../../kcats/core/collections-builtins.kcats"), include_bytes!("../../kcats/core/execute-builtins.kcats"), include_bytes!("../../kcats/core/execute.kcats"), include_bytes!("../../kcats/core/dictionary-builtins.kcats"), include_bytes!("../../kcats/core/math.kcats"), include_bytes!("../../kcats/core/compare.kcats"), include_bytes!("../../kcats/core/collections.kcats"), include_bytes!("../../kcats/core/associations-builtins.kcats"), include_bytes!("../../kcats/core/associations.kcats"), include_bytes!("../../kcats/core/dictionary.kcats"), include_bytes!("../../kcats/core/environment-builtins.kcats"), include_bytes!("../../kcats/core/environment.kcats"), include_bytes!("../../kcats/core/sets-builtins.kcats"), ]; for &file_contents in &files { let lexicon = String::from_utf8_lossy(file_contents).into_owned(); match self.dictionary.words.insert_core_module(lexicon.clone()) { Ok(_) => {} Err(mut e) => { 
e.push("content".fit(), lexicon.fit()); return Err(e); } } } Ok(()) } /// Returns an error if the stack isn't at least `min_depth` deep. fn check_stack_depth(&self, min_depth: usize) -> Result<(), Error> { //println!("Checking stack has at least {} items", min_depth); if self.stack.len() < min_depth { Err(Error::stack_underflow()) } else { Ok(()) } } /// Returns an error if the stack doesn't match the given input spec. pub fn check_input_spec(&self, specs: &dict::StackSpec) -> Result<(), Error> { self.check_stack_depth(specs.len())?; let indexes = Range { start: 0, end: specs.len(), }; indexes.into_iter().try_for_each(|i| { let item = self.stack.get(i).unwrap(); let spec = specs.get(i).unwrap(); check_type(item, &spec.elemtype) }) } pub fn is_finished(&self) -> bool { self.program.is_empty() } } /// A reducing function that loads modules one at a time. Takes an /// existing env, loads the given module and returns the new env with /// the dictionary that the module built on the stack. fn load_module(mut env: Environment, module: &&str) -> Environment { //println!("Loading module {}", *module); env.push(Word::derive(*module)); env.program.prepend(list!( "dictionary", "swap", list!( "decache", "string", "read", list!("words"), "swap", "update" ), "shielddown", "dropdown" )); //println!("Dictionary: {:?}", env.dictionary); let mut new_env = futures::executor::block_on(async move { axiom::eval(env).await }); //println!("Env: {:?}", new_env); let mut dict: dict::Dictionary = new_env.pop().try_fit().unwrap(); dict.resolve(); new_env.dictionary = dict; //println!("loaded module {}: {:?}", module, new_env.dictionary); new_env } impl Default for Environment { /// Returns the default environment, which is the "standard" /// environment. It loads some standard libraries and core /// functions. The environment is only built once and memoized. 
fn default() -> Self { static INST: Lazy<Environment> = Lazy::new(|| { //println!("Env::default"); let mut env = Environment { dictionary: dict::Dictionary { words: dict::Words::builtins(), modules: Default::default(), cache: HashMap::new(), }, stack: Default::default(), program: Default::default(), }; env.load_core_modules() .expect("failed to load core modules"); env.dictionary.resolve(); //println!("Dict has {} words", env.dictionary.entries.len()); let mut env = [ "errors", "encode", "time", // good candidate for lib "pipes", "methods", "generators", "debug", "more-generators", "database", //"template", //"debug", // END default namespace // another good candidate for lib ] .iter() .fold(env, load_module); // println!( // "After loading modules Dict has {} words", // env.dictionary.entries.len() // ); // need to do this again because we loaded some builtins // above, need to add the rust definitions back that got // overwritten. env.dictionary.resolve(); //print!("Env: {:?}", env); env }); INST.clone() } } /// Returns an error if the [Item] is not of the type specified by /// [Word] `w`. This allows specs to have their own little type /// hierarchy, eg, `integer` is a `number`, `list` is a `sized` etc. 
fn check_type(i: &Item, w: &Word) -> Result<(), Error> { //println!("Check {:?} is {:?}", w, i); match (w, i) { (w, _) if *w == *S_ITEM => Ok(()), (w, Item::Dispenser(_) | Item::Receptacle(_)) if *w == *S_DISPENSER => Ok(()), (w, Item::Receptacle(_) | Item::Dispenser(_)) if *w == *S_RECEPTACLE => Ok(()), (w, Item::Number(Number::Int(_))) if *w == *S_INTEGER || *w == *S_NUMBER => Ok(()), (w, Item::Number(Number::Float(_))) if *w == *S_FLOAT || *w == *S_NUMBER => Ok(()), // TODO: also handle cases where bytes/string is a list ( w, Item::Dispenser(coll::Dispenser::Sized(coll::Sized::Bytes(_))) | Item::Receptacle(coll::Receptacle::Sized(coll::Sized::Bytes(_))), ) if *w == *S_BYTES || *w == *S_ORDERED => Ok(()), ( w, Item::Dispenser(coll::Dispenser::Sized(coll::Sized::String(_))) | Item::Receptacle(coll::Receptacle::Sized(coll::Sized::String(_))), ) if *w == *S_STRING => Ok(()), (w, Item::Word(_)) if *w == *S_WORD => Ok(()), ( w, Item::Dispenser(coll::Dispenser::Out(_)) | Item::Dispenser(coll::Dispenser::Tunnel(_)) | Item::Receptacle(coll::Receptacle::Tunnel(_)) | Item::Receptacle(coll::Receptacle::In(_)), ) if *w == *S_PIPE => Ok(()), ( w, Item::Dispenser(coll::Dispenser::Sized(coll::Sized::List(_))) | Item::Receptacle(coll::Receptacle::Sized(coll::Sized::List(_))), ) if *w == *S_LIST || *w == *S_PROGRAM => Ok(()), ( w, Item::Dispenser(coll::Dispenser::Sized(coll::Sized::Associative(_))) | Item::Receptacle(coll::Receptacle::Sized(coll::Sized::Associative(_))), ) if *w == *S_ASSOC => Ok(()), ( w, Item::Dispenser(coll::Dispenser::Sized(coll::Sized::Associative( assoc::Associative::Error(_), ))) | Item::Receptacle(coll::Receptacle::Sized(coll::Sized::Associative( assoc::Associative::Error(_), ))), ) if *w == *S_ERROR => Ok(()), ( w, Item::Dispenser(coll::Dispenser::Sized(coll::Sized::Associative( assoc::Associative::Dictionary(_), ))) | Item::Receptacle(coll::Receptacle::Sized(coll::Sized::Associative( assoc::Associative::Dictionary(_), ))), ) if *w == *S_DICTIONARY => 
Ok(()), ( w, Item::Dispenser(coll::Dispenser::Sized(_)) | Item::Receptacle(coll::Receptacle::Sized(_)), ) if *w == *S_SIZED || *w == *S_ORDERED => Ok(()), ( w, Item::Dispenser(coll::Dispenser::Sized(coll::Sized::Associative( assoc::Associative::Env(_), ))) | Item::Receptacle(coll::Receptacle::Sized(coll::Sized::Associative( assoc::Associative::Env(_), ))), ) if *w == *S_ENVIRONMENT => Ok(()), (w, i) => { //println!("Type check failed! wanted {} got {:?}", w.data, i); let expected = format!("{}?", w.data); Err(Error::expected(expected.as_str(), i.clone())) } } } impl TryDerive<Box<dyn Iterator<Item = Item>>> for Environment { type Error = Error; fn try_derive(iter: Box<dyn Iterator<Item = Item>>) -> Result<Self, Error> { let mut stack: Option<coll::List> = None; let mut program: Option<coll::List> = None; let mut dictionary: Option<dict::Dictionary> = None; for i in iter { let (k, v): (assoc::KeyItem, Item) = i.try_fit()?; if k == "stack".fit() { stack = Some(v.try_fit()?) } else if k == "program".fit() { program = Some(v.try_fit()?) 
} else if k == "dictionary".fit() { let mut d = dict::Dictionary::try_derive(v)?; d.resolve(); dictionary = Some(d); } else { continue; } } let env = Environment { stack: stack.unwrap_or_default(), program: program.unwrap_or_default(), dictionary: dictionary.unwrap_or_else(|| Environment::default().dictionary), }; Ok(env) } } impl TryDerive<Item> for Environment { type Error = Error; fn try_derive(i: Item) -> Result<Self, Self::Error> { //println!("Convert to env: {:?}", i); let s = coll::Sized::try_derive(i)?; match s { coll::Sized::Associative(assoc::Associative::Env(e)) => Ok(e), l => l.into_iter().try_fit(), } } } impl Derive<Environment> for Item { fn derive(env: Environment) -> Item { assoc::Associative::Env(env).fit() } } impl Derive<Environment> for Future<Environment> { fn derive(env: Environment) -> Future<Environment> { Box::pin(future::ready(env)) } } impl IntoIterator for Environment { type Item = assoc::Entry; type IntoIter = Box<dyn Iterator<Item = assoc::Entry>>; fn into_iter(self) -> Self::IntoIter { let v: Vec<(assoc::KeyItem, Item)> = vec![ ("stack".fit(), self.stack.fit()), ("program".fit(), self.program.fit()), ("dictionary".fit(), self.dictionary.fit()), ]; Box::new(v.into_iter()) } } impl serialize::Display for Environment { fn representation(&self) -> Item { let assoc = assoc::Association::derive_iter(self.clone()); //let am = assoc.mutate(); //am.remove(&("dictionary".fit())); assoc.fit() } }
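`Environment::insert` above uses a "demotion" pattern: the environment behaves as a fixed-field associative structure as long as writes target its known keys (`stack`, `program`, `dictionary`) with values of the right type, and otherwise demotes itself to a generic association that accepts any key. The following is a simplified sketch of that pattern with hypothetical types (`Assoc`, two fields instead of three), not the real implementation:

```rust
use std::collections::HashMap;

#[derive(Clone, Debug, PartialEq)]
enum Assoc {
    // Typed record with fixed fields.
    Env { stack: Vec<i64>, program: Vec<i64> },
    // Generic fallback that supports any key.
    Map(HashMap<String, Vec<i64>>),
}

fn insert(a: Assoc, key: &str, val: Vec<i64>) -> Assoc {
    match a {
        Assoc::Env { stack, program } => match key {
            // Known keys update the field and keep the typed form.
            "stack" => Assoc::Env { stack: val, program },
            "program" => Assoc::Env { stack, program: val },
            // Unknown key: demote to a generic map, preserving the old fields.
            other => {
                let mut m = HashMap::new();
                m.insert("stack".to_string(), stack);
                m.insert("program".to_string(), program);
                m.insert(other.to_string(), val);
                Assoc::Map(m)
            }
        },
        Assoc::Map(mut m) => {
            m.insert(key.to_string(), val);
            Assoc::Map(m)
        }
    }
}

fn main() {
    let e = Assoc::Env { stack: vec![1], program: vec![] };
    // Known key: still a typed Environment afterwards.
    let e = insert(e, "stack", vec![2, 3]);
    assert!(matches!(e, Assoc::Env { .. }));
    // Unknown key: demoted to a plain association.
    let m = insert(e, "notes", vec![9]);
    assert!(matches!(m, Assoc::Map(_)));
}
```

The benefit is that common operations stay on the fast, statically-typed representation, while the associative interface remains total: no insert ever fails, it just produces a more generic value.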
1.5.3.8. Hash-based object cache
We need to be able to fetch binary data by its hash or by a local name. This supplements the database by storing larger objects directly in the filesystem, which is more efficient than storing them as database values. The cache is also used by the kcats module system even when the optional database feature is disabled. When the database is enabled, hashes or local aliases can be stored in the db as values; once those are retrieved, the actual contents can be fetched from the cache. Currently there is no disk space management, so the cache grows without bound.
We also want to pre-populate some cache items at build time (some of the standard libraries), so the cache lives in a separate crate that can be used at both build time and runtime.
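The core idea is a content-addressed store: content is named by its hash, so identical content always lands at the same key, and human-readable aliases are one level of indirection pointing at a hash. The real crate below hashes with SHA-256, stores blobs as files named by the base64url hash, and implements aliases as symlinks (or link files); in this minimal in-memory sketch, std's `DefaultHasher` and a `HashMap` stand in for both:

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

struct Store {
    blobs: HashMap<String, Vec<u8>>,  // hash -> content
    aliases: HashMap<String, String>, // alias -> hash
}

// Toy digest; the real cache uses SHA-256 encoded as base64url.
fn digest(content: &[u8]) -> String {
    let mut h = DefaultHasher::new();
    content.hash(&mut h);
    format!("{:016x}", h.finish())
}

impl Store {
    fn new() -> Self {
        Store { blobs: HashMap::new(), aliases: HashMap::new() }
    }

    // Store content under its hash, optionally registering an alias.
    // Idempotent: identical content always produces the same key.
    fn put(&mut self, content: &[u8], alias: Option<&str>) -> String {
        let key = digest(content);
        self.blobs.entry(key.clone()).or_insert_with(|| content.to_vec());
        if let Some(a) = alias {
            self.aliases.insert(a.to_string(), key.clone());
        }
        key
    }

    // Fetch by hash or by alias (one level of indirection, like the symlink).
    fn get(&self, key: &str) -> Option<&[u8]> {
        let hash = self.aliases.get(key).map(String::as_str).unwrap_or(key);
        self.blobs.get(hash).map(Vec::as_slice)
    }
}

fn main() {
    let mut s = Store::new();
    let k = s.put(b"hello", Some("greeting"));
    assert_eq!(s.get(&k), Some(&b"hello"[..]));
    assert_eq!(s.get("greeting"), Some(&b"hello"[..]));
    // Re-putting identical content yields the same key.
    assert_eq!(s.put(b"hello", None), k);
}
```

Because names are derived from content, writes are naturally deduplicated and cached data can be verified after retrieval by re-hashing.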
[package] name = "cache" version = "0.1.0" edition = "2021" [dependencies] base64 = "0.13.0" sha2 = {version="0.10.6", features=["std"]} directories = "5.0" [dependencies.uuid] version = "1.6.1" features = [ "v4", # Lets you generate random UUIDs "v7", "fast-rng", # Use a faster (but still sufficiently random) RNG ]
pub mod cache { use base64::URL_SAFE_NO_PAD; use io::Error; use sha2::Sha256; use sha2::{self, Digest}; use std::fs::{self, File}; use std::io::{self, Read, Write}; use std::io::{BufReader, BufWriter}; #[cfg(windows)] use std::os::windows::fs::{symlink_dir, symlink_file}; // For Windows use std::path::{Path, PathBuf}; use uuid::Uuid; type Bytes = Vec<u8>; pub enum Key { Hash(Bytes), Alias(String), } #[derive(Clone, PartialEq, Debug)] pub struct Cache { path: PathBuf, } impl Cache { pub fn new(path: PathBuf) -> Result<Self, Error> { if path.exists() && path.is_dir() { fs::create_dir_all(&path)?; Ok(Cache { path: path }) } else { Err(Error::new(io::ErrorKind::NotFound, "Cache dir not found")) } } fn path(&self, key: &Key) -> PathBuf { // Look up the file let filename = match key { Key::Hash(hash) => base64::encode_config(hash, base64::URL_SAFE_NO_PAD), Key::Alias(word) => word.to_string(), }; let file = self.path.join(filename); file } pub fn get(&self, key: &Key) -> Result<Bytes, Error> { let mut content = Bytes::new(); fetch_link(self.path(key).as_path()).and_then(|mut f| f.read_to_end(&mut content))?; Ok(content) } pub fn deref(&self, key: &Key) -> Result<Bytes, io::Error> { match key { Key::Hash(h) => Ok(h.clone()), Key::Alias(a) => self .target(a.clone()) .and_then(|t| Ok(base64::decode_config(t, base64::URL_SAFE_NO_PAD).unwrap())), } } fn target(&self, alias: String) -> Result<String, io::Error> { let path = self.path(&Key::Alias(alias)); match get_link_type() { Link::Symlink => fs::read_link(path) .map(|p| p.file_name().unwrap().to_str().unwrap().to_string()), Link::Manual => { let mut link_file = File::open(path)?; let mut target_path = String::new(); link_file.read_to_string(&mut target_path)?; Ok(target_path) } } } pub fn put(&self, content: &Bytes, alias: Option<String>) -> Result<Bytes, Error> { let hash = sha2::Sha256::digest(&content); let hashfilename = base64::encode_config(hash, base64::URL_SAFE_NO_PAD); let target = 
self.path.join(hashfilename.clone()); // only write the file if it doesn't exist if !target.exists() { std::fs::write(target.clone(), content)?; } // Create the alias if necessary if let Some(alias) = alias { let alias_path = self.path.join(PathBuf::from(alias.to_string())); create_link(Path::new(&hashfilename), alias_path.as_path())?; } Ok(hash.to_vec()) } pub fn put_from_path(&self, path: &Path, alias: Option<String>) -> Result<Bytes, Error> { let unique_temp_file = format!("temp_{}", Uuid::new_v4()); // Generate a unique temporary filename let temp_file_path = self.path.join(unique_temp_file); let file = File::open(path)?; let mut reader = BufReader::new(file); let mut hasher = Sha256::new(); let mut buffer = [0u8; 4096]; // 4 KiB buffer let mut writer = BufWriter::new(File::create(&temp_file_path)?); // Read, hash, and write in chunks loop { let count = reader.read(&mut buffer)?; if count == 0 { break; } hasher.update(&buffer[..count]); writer.write_all(&buffer[..count])?; } writer.flush()?; // Compute the final hash and rename the file according to the hash let hash = hasher.finalize(); let file_name = base64::encode_config(&hash, URL_SAFE_NO_PAD); let final_path = self.path.join(file_name.clone()); // Move to final destination, overwriting anything there fs::rename(temp_file_path.clone(), final_path.clone())?; // Create the alias if necessary if let Some(alias) = alias { let alias_path = self.path.join(alias); let target_path = Path::new(".").join(file_name); create_link(&target_path, &alias_path)?; } Ok(hash.to_vec()) } } enum Link { Manual, Symlink, } /// Determines the appropriate link type based on the operating system. fn get_link_type() -> Link { if cfg!(target_os = "linux") || cfg!(target_os = "macos") || cfg!(target_os = "windows") || cfg!(target_os = "android") { Link::Symlink } else if cfg!(target_os = "ios") { Link::Manual } else { panic!("Unsupported operating system for linking"); } } /// Creates a link based on the specified link type. 
/// /// # Arguments /// * `target` - The target file or directory to link to. /// * `link_name` - The name of the symlink or link file to create. /// /// # Returns /// A `Result` indicating success or failure. fn create_link(target: &Path, link_name: &Path) -> io::Result<()> { //println!("Creating link from {:?} to {:?}", link_name, target); if link_name.exists() { fs::remove_file(link_name)?; } match get_link_type() { Link::Symlink => { #[cfg(unix)] std::os::unix::fs::symlink(target, link_name)?; #[cfg(windows)] { if target.is_dir() { std::os::windows::fs::symlink_dir(target, link_name)? } else { std::os::windows::fs::symlink_file(target, link_name)? } } } Link::Manual => { let mut link_file = File::create(link_name)?; let path = target.canonicalize()?; let target_path = path.to_str().ok_or(Error::new( io::ErrorKind::Other, "Failed to convert path to string", ))?; writeln!(link_file, "{}", target_path)?; } } Ok(()) } /// Fetches the target of a link. /// /// # Arguments /// * `link_name` - The link file or symlink to fetch. /// /// # Returns /// A `Result` containing the target `File`. fn fetch_link(link_name: &Path) -> io::Result<File> { match get_link_type() { Link::Symlink => File::open(link_name), Link::Manual => { let mut link_file = File::open(link_name)?; let mut target_path = String::new(); link_file.read_to_string(&mut target_path)?; File::open(target_path) } } } /// Deletes a link. /// /// # Arguments /// * `link_name` - The link file or symlink to delete. /// /// # Returns /// A `Result` indicating success or failure. #[allow(dead_code)] fn delete_link(link_name: &Path) -> io::Result<()> { fs::remove_file(link_name) } }
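The `Link::Manual` fallback above (used where real symlinks are unavailable, e.g. iOS) stores the "link" as an ordinary file whose contents are the target path; resolving it means reading that path and opening it. A self-contained sketch of just that fallback, using only std and a temp directory (names like `create_manual_link` are illustrative, not the crate's API):

```rust
use std::fs::{self, File};
use std::io::{Read, Write};
use std::path::Path;

// Write the target path into an ordinary file acting as the link.
fn create_manual_link(target: &str, link: &Path) -> std::io::Result<()> {
    let mut f = File::create(link)?;
    writeln!(f, "{}", target)?;
    Ok(())
}

// Read the stored path back (trimming the trailing newline) and open it.
fn fetch_manual_link(link: &Path) -> std::io::Result<File> {
    let mut s = String::new();
    File::open(link)?.read_to_string(&mut s)?;
    File::open(s.trim())
}

fn main() -> std::io::Result<()> {
    let dir = std::env::temp_dir().join("kcats_link_demo");
    fs::create_dir_all(&dir)?;
    let target = dir.join("blob");
    fs::write(&target, b"contents")?;

    let link = dir.join("alias");
    create_manual_link(target.to_str().unwrap(), &link)?;

    let mut out = String::new();
    fetch_manual_link(&link)?.read_to_string(&mut out)?;
    assert_eq!(out, "contents");

    fs::remove_dir_all(&dir)?;
    Ok(())
}
```

Note that the reader must trim the trailing newline written by `writeln!` before opening the target, or the open will fail on most filesystems.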
1.5.3.9. Cryptographic primitives
We'll implement certain cryptographic functions (hashing, encryption, signing) in Rust and expose kcats words for them.
use crate::axiom::ItemResult; use crate::traits::*; use crate::types::container::{associative as assoc, error::Error}; use crate::types::number::Int; use crate::types::{Bytes, Item}; use core::ops::Deref; use ed25519_dalek as signing; use ed25519_dalek::{Signer, Verifier}; use rand::rngs::OsRng; // Import OsRng use rand::RngCore as RandRngCore; use rand_core::{CryptoRng, RngCore}; use sha2::{self, Digest}; // Import RngCore for the fill_bytes method pub fn hash(i: Item) -> ItemResult { let b = Bytes::try_derive(i).unwrap(); Ok(sha2::Sha256::digest(&b).deref().to_vec().fit()) } type Value = Vec<u8>; pub struct SeededRNG { seed: Value, salt: Value, } impl SeededRNG { // Hash of seed|value fn hash(&self) -> Vec<u8> { let mut v = self.seed.clone(); v.extend(self.salt.clone()); sha2::Sha256::digest(v.as_slice()).deref().to_vec() } } impl RngCore for SeededRNG { fn next_u32(&mut self) -> u32 { rand_core::impls::next_u32_via_fill(self) } fn next_u64(&mut self) -> u64 { rand_core::impls::next_u64_via_fill(self) } fn fill_bytes(&mut self, dest: &mut [u8]) { let l = dest.len(); dest.copy_from_slice(&self.hash()[..l]); } fn try_fill_bytes(&mut self, dest: &mut [u8]) -> Result<(), rand_core::Error> { self.fill_bytes(dest); Ok(()) } } #[allow(dead_code)] pub fn hash_bytes(contents: &[u8]) -> Vec<u8> { let mut hasher = sha2::Sha256::new(); //let mut buffer = [0; 1024]; // Read in chunks of 1024 bytes let count = contents.len(); hasher.update(&contents[..count]); hasher.finalize().to_vec() } impl CryptoRng for SeededRNG {} pub fn key(seed: Item) -> ItemResult { let sbs: Bytes = seed.try_fit()?; let kp = signing::Keypair::generate(&mut SeededRNG { seed: vec![], salt: sbs, }); Ok(assoc::Association::derive_iter([ ("type".fit(), "elliptic-curve-key".fit()), ("secret".fit(), kp.secret.as_ref().to_vec().fit()), ("public".fit(), kp.public.as_ref().to_vec().fit()), ]) .fit()) } impl TryDerive<Item> for signing::Keypair { type Error = Error; fn try_derive(i: Item) -> Result<Self, 
Self::Error> { let sk: signing::SecretKey = i.try_fit()?; let pk: signing::PublicKey = (&sk).into(); Ok(signing::Keypair { secret: sk, public: pk, }) } } impl TryDerive<Item> for signing::SecretKey { type Error = Error; fn try_derive(i: Item) -> Result<Self, Self::Error> { let a = assoc::Associative::try_derive(i)?; if a.get(&"type".fit()) == Some("elliptic-curve-key".fit()) { let sk = signing::SecretKey::from_bytes( &Bytes::try_derive( a.get(&"secret".fit()) .ok_or_else(|| Error::expected("secret", None::<Item>))?, )?[..], ) .map_err(|_e| Error::expected("valid-secret-key", None::<Item>))?; Ok(sk) } else { Err(Error::expected("keypair", a)) } } } impl TryDerive<Item> for signing::PublicKey { type Error = Error; fn try_derive(i: Item) -> Result<Self, Self::Error> { let a = assoc::Associative::try_derive(i)?; if a.get(&"type".fit()) == Some("elliptic-curve-key".fit()) { let pk = signing::PublicKey::from_bytes( &Bytes::try_derive( a.get(&"public".fit()) .ok_or_else(|| Error::expected("public", None::<Item>))?, )?[..], ) .map_err(|_e| Error::expected("valid-public-key", None::<Item>))?; Ok(pk) } else { Err(Error::expected("public-key", a)) } } } //TODO: we can only call sign from a keypair, so we may want to assume // that we have either the kp, or just the secret key. 
pub fn sign(k: Item, m: Item) -> ItemResult { let kp: signing::Keypair = k.try_fit()?; let message: Bytes = m.try_fit()?; let signature: signing::Signature = kp.sign(&message); Ok(signature.as_ref().to_vec().fit()) } pub fn verify(k: Item, m: Item, s: Item) -> ItemResult { let mret = m.clone(); let pk: signing::PublicKey = k.try_fit()?; let mbs: Bytes = m.try_fit()?; let sbs: Bytes = s.try_fit()?; let sig = signing::Signature::from_bytes(&sbs) .map_err(|_e| Error::expected("signature", None::<Item>))?; Ok(pk.verify(&mbs, &sig).map(|_| mret).unwrap_or_default()) } fn random_bytes(n: usize) -> Vec<u8> { let mut bytes = vec![0u8; n]; // Create a vector of n zeros OsRng.fill_bytes(&mut bytes); // Fill the vector with random bytes bytes } pub fn random(n: Item) -> ItemResult { let n: Int = n.try_fit()?; Ok(random_bytes(n as usize).fit()) }
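The `SeededRNG` above exists so that key generation can be deterministic: bytes are derived by hashing `seed || salt`, so the same inputs always reproduce the same key material. The sketch below shows that idea in isolation, with std's 64-bit `DefaultHasher` standing in for SHA-256 and a block counter mixed in so successive draws differ; a production implementation would use a proper KDF (e.g. HKDF) rather than this construction:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

struct SeededRng {
    seed: Vec<u8>,
    salt: Vec<u8>,
    counter: u64,
}

impl SeededRng {
    // Fill `dest` with bytes derived from hash(seed, salt, counter),
    // advancing the counter per 8-byte block.
    fn fill_bytes(&mut self, dest: &mut [u8]) {
        for chunk in dest.chunks_mut(8) {
            let mut h = DefaultHasher::new();
            self.seed.hash(&mut h);
            self.salt.hash(&mut h);
            self.counter.hash(&mut h);
            self.counter += 1;
            let block = h.finish().to_le_bytes();
            chunk.copy_from_slice(&block[..chunk.len()]);
        }
    }
}

fn main() {
    let mut a = SeededRng { seed: b"seed".to_vec(), salt: b"salt".to_vec(), counter: 0 };
    let mut b = SeededRng { seed: b"seed".to_vec(), salt: b"salt".to_vec(), counter: 0 };
    let (mut ka, mut kb) = ([0u8; 32], [0u8; 32]);
    a.fill_bytes(&mut ka);
    b.fill_bytes(&mut kb);
    // Same seed and salt: identical "key" bytes, so keypairs are reproducible.
    assert_eq!(ka, kb);

    // A different salt produces different bytes.
    let mut c = SeededRng { seed: b"seed".to_vec(), salt: b"other".to_vec(), counter: 0 };
    let mut kc = [0u8; 32];
    c.fill_bytes(&mut kc);
    assert_ne!(ka, kc);
}
```

Deterministic generation is what lets `key` derive a reproducible elliptic-curve keypair from a seed item, at the cost that anyone who learns the seed can reproduce the secret key.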
1.5.4. Serialization
We'll define how kcats data structures are parsed and written (for example, in order to read from and write to disk).
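Before the full implementation below, here is a toy round-trip for the vector-only edn subset it targets: nested lists of integers and words, written with square brackets and spaces. The real parser delegates to the `edn_format` crate and also handles strings, floats, chars, and the `#b64` byte tag; this sketch (with hypothetical `Val`/`emit`/`parse` names) only shows the shape of the encoding:

```rust
#[derive(Debug, PartialEq)]
enum Val {
    Int(i64),
    Word(String),
    List(Vec<Val>),
}

// Serialize a value as a bracketed, space-separated vector.
fn emit(v: &Val) -> String {
    match v {
        Val::Int(i) => i.to_string(),
        Val::Word(w) => w.clone(),
        Val::List(items) => {
            let inner: Vec<String> = items.iter().map(emit).collect();
            format!("[{}]", inner.join(" "))
        }
    }
}

// Parse by tokenizing brackets/atoms, building the tree with a stack.
// Top-level expressions are collected into an implicit outer list,
// mirroring how `parse` above returns a coll::List of items.
fn parse(src: &str) -> Val {
    let spaced = src.replace('[', " [ ").replace(']', " ] ");
    let mut stack: Vec<Vec<Val>> = vec![Vec::new()];
    for tok in spaced.split_whitespace() {
        match tok {
            "[" => stack.push(Vec::new()),
            "]" => {
                let done = Val::List(stack.pop().unwrap());
                stack.last_mut().unwrap().push(done);
            }
            atom => {
                let v = atom
                    .parse::<i64>()
                    .map(Val::Int)
                    .unwrap_or_else(|_| Val::Word(atom.to_string()));
                stack.last_mut().unwrap().push(v);
            }
        }
    }
    Val::List(stack.pop().unwrap())
}

fn main() {
    let prog = Val::List(vec![
        Val::Int(1),
        Val::Int(2),
        Val::List(vec![Val::Word("swap".into())]),
    ]);
    assert_eq!(emit(&prog), "[1 2 [swap]]");
    // Round trip: parsing wraps top-level expressions in an implicit list.
    assert_eq!(
        parse("1 2 [swap]"),
        Val::List(vec![
            Val::Int(1),
            Val::Int(2),
            Val::List(vec![Val::Word("swap".into())]),
        ])
    );
}
```

Restricting serialization to a single container shape keeps emission deterministic, at the cost noted below: sets, maps, and lists all collapse to vectors on the way out.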
//! Serializes and parses kcats data. kcats serialization is inspired //! by Joy, and implemented as a subset of edn (where only vector //! containers from edn are used, no lists, maps or sets). Currently //! one custom tag is used for encoding byte arrays, but this is //! subject to change. use crate::traits::*; use crate::types::container::{ self as coll, associative as assoc, environment::Environment, error::Error, }; use crate::types::number::Number; use crate::types::*; use base64::prelude::BASE64_URL_SAFE_NO_PAD; use base64::Engine; use std::collections::VecDeque; use std::fmt; use std::string; use std::sync::Arc; pub trait Display { fn representation(&self) -> Item; } const BYTE_TAG: &str = "b64"; /// Parses a serialized value into an [Item]. fn to_item(item: &edn_format::Value) -> Result<Item, Error> { //println!("to item {:?}", item); match item { edn_format::Value::Integer(i) => Ok(Item::Number(Number::Int(*i))), edn_format::Value::Vector(v) => Ok({ coll::List::derive_iter( v.iter() .map(to_item) .collect::<Result<VecDeque<Item>, Error>>()?, ) .fit() }), edn_format::Value::Symbol(s) => Ok(Item::Word(s.to_string().fit())), // we don't have booleans in kcats, so if we see 'false' that // is the word false which is not defined in the base // language, but might be user-defined later. 
edn_format::Value::Boolean(b) => Ok(if *b { "yes".fit() } else { "false".fit() }), edn_format::Value::String(s) => Ok(s.to_string().fit()), edn_format::Value::Float(f) => Ok(Item::Number(Number::Float(f.into_inner()))), edn_format::Value::TaggedElement(tag, e) => { if *tag == edn_format::Symbol::from_name(BYTE_TAG) { if let edn_format::Value::String(s) = &**e { Ok(BASE64_URL_SAFE_NO_PAD .decode(s.clone().into_bytes()) .unwrap() .fit()) } else { Err(Error::parse("Invalid tag datatype for byte literal")) } } else { Err(Error::parse("Unsupported tag")) } } edn_format::Value::Character(c) => Ok(Item::Char(*c)), _ => Err(Error::parse("Unsupported data literal")), } } fn from_sized(s: &coll::Sized) -> edn_format::Value { match s { coll::Sized::Associative(assoc::Associative::Words(e)) => edn_format::Value::Symbol( edn_format::Symbol::from_name(format!("{}_entries", e.len()).as_str()), ), coll::Sized::Associative(assoc::Associative::Env(e)) => (&e.representation()).into(), coll::Sized::String(s) => edn_format::Value::String(s.to_string()), coll::Sized::Bytes(bs) => edn_format::Value::TaggedElement( edn_format::Symbol::from_name("b64"), Box::new(edn_format::Value::String(BASE64_URL_SAFE_NO_PAD.encode(bs))), ), coll::Sized::Associative(a) => { let mut av = a.clone().to_iter().collect::<Vec<(assoc::KeyItem, Item)>>(); av.sort_by(|(ka, _), (kb, _)| ka.cmp(kb)); edn_format::Value::Vector( av.into_iter() .map(|i| (&Item::derive(i)).into()) .collect::<Vec<edn_format::Value>>(), ) } coll::Sized::Set(s) => { let mut v = s.iter().cloned().collect::<Vec<assoc::KeyItem>>(); v.sort(); edn_format::Value::Vector(v.into_iter().map(|ki| (&Item::derive(ki)).into()).collect()) } v => edn_format::Value::Vector( v.clone() .into_iter() .map(|i| (&i).into()) .collect::<Vec<edn_format::Value>>(), ), } } /// Serializes the item deterministically. 
Certain data is lost in /// serialization, including the type of container (sets/maps/lists /// all are serialized as vectors) impl<'a> From<&'a Item> for edn_format::Value { fn from(item: &Item) -> Self { match item { // dictionaries are big and it's ugly to print them for // environments. Item::Number(Number::Int(i)) => edn_format::Value::Integer(*i), Item::Number(Number::Float(f)) => edn_format::Value::from(*f), Item::Char(c) => edn_format::Value::Character(*c), Item::Word(w) => edn_format::Value::Symbol(edn_format::Symbol::from_name(w.fit())), //Item::Entry(w) => edn_format::Value::Symbol(edn_format::Symbol::from_name(&w.word)), Item::Dispenser(coll::Dispenser::Out(o)) => (&o.representation()).into(), Item::Dispenser(coll::Dispenser::Tunnel(t)) => (&t.representation()).into(), Item::Receptacle(coll::Receptacle::In(i)) => (&i.representation()).into(), Item::Receptacle(coll::Receptacle::Tunnel(t)) => (&t.representation()).into(), Item::Dispenser(coll::Dispenser::Sized(s)) => from_sized(s), Item::Receptacle(coll::Receptacle::Sized(s)) => from_sized(s), } } } pub fn parse(s: String) -> Result<coll::List, Error> { let parser = edn_format::Parser::from_iter(s.chars(), edn_format::ParserOptions::default()); Ok(coll::List::derive_iter( parser .map(move |r| match r { Ok(expr) => Ok(to_item(&expr)?), Err(e) => Err(Error::from(e)), }) .collect::<Result<Vec<Item>, Error>>()?, )) } pub trait Emit { fn emit(self) -> String; } impl<'a> Emit for &'a Item { fn emit(self) -> String { edn_format::emit_str(&(self).into()) } } impl<I, T> Emit for I where I: Iterator<Item = T>, T: Emit, { fn emit(self) -> String { let mut s: String = String::new(); for i in self { s.push_str(i.emit().as_str()); s.push(' '); } s.pop(); s.to_string() } } // print out envs in error messages impl fmt::Debug for Environment { fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result { write!( f, "{{ stack: {}, program: {} }}", (&Item::derive(self.stack.clone())).emit(), 
(&Item::derive(self.program.clone())).emit(), ) } } impl fmt::Debug for Error { fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result { write!(f, "{}", (&Item::derive(self.data.clone())).emit()) } } impl From<edn_format::ParserError> for Error { fn from(e: edn_format::ParserError) -> Self { Error::parse(e.to_string().as_str()) } } impl From<string::FromUtf8Error> for Error { fn from(e: string::FromUtf8Error) -> Self { Error::parse(e.to_string().as_str()) } } fn insert_line_breaks(input: &str, max_items: usize, max_chars: usize) -> String { let mut result = String::new(); let mut current_line_length = 0; let mut open_list_stack: Vec<(usize, usize)> = Vec::new(); open_list_stack.push((0, 0)); let mut last_char: char = '\n'; let mut in_string: bool = false; let mut in_tag = false; let mut chars = input.chars().peekable(); // Convert to a Peekable iterator while let Some(c) = chars.next() { current_line_length += 1; match c { '"' => { if last_char != '\\' { in_string = !in_string; } result.push(c); } '[' => { if !in_string && last_char != '\\' { open_list_stack.push((0, 0)); // Start a new list } result.push(c); } ']' => { result.push(c); if !in_string && last_char != '\\' { let (last_count, break_count) = open_list_stack.pop().unwrap(); //println!("items, breaks: {}, {}", last_count, break_count); if (last_count == 1 || last_count >= 6 || break_count > 0) && chars.peek() != Some(&']') { // Only add a newline if the next character is not a closing bracket result.push('\n'); let (_, break_count) = open_list_stack.last_mut().unwrap(); *break_count += 1; current_line_length = 0; } } } ' ' => { if !in_string { let (last_count, break_count) = open_list_stack.last_mut().unwrap(); if in_tag { in_tag = false; } else { *last_count += 1; } if (*last_count > 0 && (*last_count % max_items) == 0) || current_line_length > max_chars { result.push('\n'); *break_count += 1; current_line_length = 0; //*last_count = 0; } } result.push(c); } '#' => { if !in_string { in_tag = true; } 
result.push(c); } _ => { result.push(c); } } last_char = c; } if result.ends_with('\n') { result.pop(); } //println!("broken output: {:?}", result); result } fn parse_indent(stack: &mut Vec<usize>, input: &str) { let mut in_string = false; let mut escaped = false; for (idx, c) in input.chars().enumerate() { if in_string { match c { '"' if !escaped => in_string = false, // TODO handle \\ (escaped backslash char) '\\' if !escaped => escaped = true, _ => escaped = false, } } else { match c { '[' if !escaped => { escaped = false; stack.push(idx); } ']' if !escaped => { escaped = false; stack.pop(); } '"' => { escaped = false; in_string = true; } ';' => { break; } '\\' => { escaped = true; } _ => { escaped = false; } } } } } fn format_indentation(input: &str) -> String { let mut result = String::new(); let mut indentations = Vec::<usize>::new(); for line in input.lines() { let trimmed = line.trim(); // Deduce the new indentation based on the last item in the indentations stack let new_indent = indentations.last().copied().map(|x| x + 1).unwrap_or(0); let padded_line = format!("{}{}\n", " ".repeat(new_indent), trimmed); result.push_str(padded_line.as_str()); parse_indent(&mut indentations, &padded_line); //println!("indentations: {:?}: {:?}", padded_line, indentations); } result.pop(); // Remove the last newline result } pub fn auto_format(input: &str, max_items: usize, max_chars: usize) -> String { let with_breaks = insert_line_breaks(input, max_items, max_chars); format_indentation(&with_breaks) } /// a function that takes an env, and an input string. Parses the /// string, if it parses, returns the env with the input added to the /// program. Otherwise returns Error. 
pub fn parse_input(env: &mut Environment, input: String) -> Result<(), Error> { let mut parsed = parse(input)?; let expr = Arc::make_mut(&mut env.program); expr.extend(Arc::make_mut(&mut parsed).drain(..)); Ok(()) } #[cfg(test)] mod tests { use super::*; #[test] fn test_insert_line_breaks() { let input = "[[foo bar][baz [[quux floop][toop poop]]]]"; let expected = "[[foo bar]\n[baz [[quux floop]\n[toop poop]]]]"; let output = insert_line_breaks(input, 6, 80); assert_eq!(output, expected); let input = "[[[1 2 3] b][c d]]"; let expected = "[[[1 2 3] b]\n[c d]]"; let output = insert_line_breaks(input, 6, 80); assert_eq!(output, expected); // multiline list let input = "[[a b] [c d]] 5"; let expected = "[[a b]\n [c d]]\n 5"; let output = insert_line_breaks(input, 6, 80); assert_eq!(output, expected); } #[test] fn test_indentation() { let input = "[[foo bar]\n[baz [[quux floop]\n[toop poop]]]]"; let expected = "[[foo bar]\n [baz [[quux floop]\n [toop poop]]]]"; let output = format_indentation(input); assert_eq!(output, expected); let input = "\"hello\" [[a b]\n[c d]]"; let expected = "\"hello\" [[a b]\n [c d]]"; let output = format_indentation(input); assert_eq!(output, expected); } }
1.5.5. Builtin words
We'll define some words as axioms: words implemented directly in Rust rather than in terms of other kcats words.
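Before the full `BUILTIN_FUNCTIONS` table, the axiom/derived split can be illustrated with a toy stack machine (all names here are hypothetical, not the actual kcats types): an `Axiom` is a native Rust function over the stack, while a `Derived` definition simply expands into more words at the front of the program.

```rust
use std::collections::HashMap;

// Toy model of the dictionary: an axiom is native Rust; a derived word
// is defined in terms of other words. Names are illustrative only.
type Stack = Vec<i64>;

enum Definition {
    Axiom(fn(&mut Stack)),
    Derived(Vec<&'static str>),
}

fn run(program: &[&str], dict: &HashMap<&str, Definition>) -> Stack {
    let mut stack = Stack::new();
    // keep the remaining program as a stack, next word on top
    let mut prog: Vec<&str> = program.iter().rev().cloned().collect();
    while let Some(word) = prog.pop() {
        match dict.get(word) {
            Some(Definition::Axiom(f)) => f(&mut stack),
            Some(Definition::Derived(words)) => {
                // expand the derived word in place, preserving order
                for w in words.iter().rev() {
                    prog.push(w);
                }
            }
            // anything undefined is treated as a literal number here
            None => stack.push(word.parse().expect("number or defined word")),
        }
    }
    stack
}

fn dictionary() -> HashMap<&'static str, Definition> {
    let mut d = HashMap::new();
    d.insert("+", Definition::Axiom(|s: &mut Stack| {
        let (a, b) = (s.pop().unwrap(), s.pop().unwrap());
        s.push(a + b);
    }));
    d.insert("dup", Definition::Axiom(|s: &mut Stack| {
        let top = *s.last().unwrap();
        s.push(top);
    }));
    // 'double' is derived: defined in terms of other words
    d.insert("double", Definition::Derived(vec!["dup", "+"]));
    d
}

fn main() {
    println!("{:?}", run(&["21", "double"], &dictionary())); // [42]
}
```

The real evaluator (`eval_step` below) follows the same shape: an `Axiom` runs its boxed step function, while a `Derived` definition is prepended to the program for further evaluation.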
//! All the core functions of kcats: Words that are implemented in //! rust, instead of in terms of other kcats words. use crate::list; use crate::serialize::{self, Emit}; use crate::traits::*; use crate::config; #[cfg(feature = "database")] use crate::types::container::pipe::db; use crate::types::container::{ self as coll, associative as assoc, dictionary as dict, environment::Environment, error::Error, pipe, Container, Count, Join, Mutey, Ordered, Take, }; use crate::types::number::{Float, Int, Number}; use crate::types::*; use cache::cache; use dynfmt::{Format, SimpleCurlyFormat}; use futures::future::FutureExt; use lazy_static::lazy_static; use std::cmp::max; use std::collections::HashMap; use std::collections::VecDeque; use std::convert::Infallible; use std::default::Default; use std::mem; use std::sync::Arc; //#[cfg(feature = "httpclient")] //use surf; pub type ItemResult = Result<Item, Error>; /// Convert results into Items, for use when we intend to put the /// result on the stack whether it's an [Error] or some other [Item]. impl<T, U> Derive<Result<T, U>> for Item where T: Fit<Item>, U: Fit<Item>, { fn derive(i: Result<T, U>) -> Self { match i { Ok(i) => i.fit(), Err(e) => e.fit(), } } } /// A higher order function that executes a simpler function `f`, /// where `f` takes a stack item and returns a [Result] of another /// stack item. 
fn f_stack1<T, U, R>(f: fn(U) -> Result<T, R>) -> impl Fn(Environment) -> Future<Environment> where T: Fit<Item>, U: TryDerive<Item>, U::Error: Fit<Error>, R: Fit<Error>, { move |mut env: Environment| { let x = env .tos() .ok_or_else(Error::stack_underflow) .and_then(|x| U::try_derive(x.clone()).map_err(Fit::fit)); match x { Ok(x) => { let res = f(x); match res { Ok(r) => { env.pop_prog(); env.pop(); env.push(r); } Err(e) => { let err: Error = e.fit(); env.push(err); } } } Err(e) => { env.push(e); } } env.fit() } } /// A higher order function that executes a simpler function `f`, /// where `f` takes two stack items and returns a [Result] of another /// stack item. fn f_stack2<T, U, V, R>(f: fn(U, V) -> Result<T, R>) -> impl Fn(Environment) -> Future<Environment> where T: Fit<Item>, U: TryDerive<Item>, U::Error: Fit<Error>, V: TryDerive<Item>, V::Error: Fit<Error>, R: Fit<Error>, { move |mut env: Environment| { let x = env .tos() .ok_or_else(Error::stack_underflow) .and_then(|x| V::try_derive(x.clone()).map_err(Fit::fit)); let y = env .stack .get(1) .ok_or_else(Error::stack_underflow) .and_then(|y| U::try_derive(y.clone()).map_err(Fit::fit)); match (x, y) { (Ok(x), Ok(y)) => { let res = f(y, x); match res { Ok(r) => { env.pop_prog(); env.pop(); env.pop(); env.push(r); } Err(e) => { env.push(e.fit()); } } } (Err(e), _) => { env.push(e); } (_, Err(e)) => { env.push(e); } } env.fit() } } /// A higher order function that executes a simpler function `f`, /// where `f` takes 3 stack items and returns a [Result] of another /// stack item. 
fn f_stack3<T, U, V, W, R>( f: fn(U, V, W) -> Result<T, R>, ) -> impl Fn(Environment) -> Future<Environment> where T: Fit<Item>, U: TryDerive<Item>, U::Error: Fit<Error>, V: TryDerive<Item>, V::Error: Fit<Error>, W: TryDerive<Item>, W::Error: Fit<Error>, R: Fit<Error>, { move |mut env: Environment| { let x = env .tos() .ok_or_else(Error::stack_underflow) .and_then(|x| W::try_derive(x.clone()).map_err(Fit::fit)); let y = env .stack .get(1) .ok_or_else(Error::stack_underflow) .and_then(|y| V::try_derive(y.clone()).map_err(Fit::fit)); let z = env .stack .get(2) .ok_or_else(Error::stack_underflow) .and_then(|z| U::try_derive(z.clone()).map_err(Fit::fit)); match (x, y, z) { (Ok(x), Ok(y), Ok(z)) => { let res = f(z, y, x); match res { Ok(r) => { env.pop_prog(); env.pop(); env.pop(); env.pop(); env.push(r); } Err(e) => { env.push(e.fit()); } } } (Err(e), _, _) => { env.push(e); } (_, Err(e), _) => { env.push(e); } (_, _, Err(e)) => { env.push(e); } } env.fit() } } fn f_stack2_async( f: fn(Item, Item) -> Future<ItemResult>, ) -> impl Fn(Environment) -> Future<Environment> { move |mut env: Environment| { let x = env.pop(); let y = env.pop(); Box::pin(f(x, y).map(|r| { if r.is_ok() { env.pop_prog(); } env.push(r); env })) } } /// Wrapper function that allows you to use the ? operator in your own /// functions. If that function returns an error result, it will /// append that error to the env. The function `f` should return /// either unit or an Error. If it returns an [Error] it will be /// pushed onto the stack. fn f_result<F>(f: F) -> impl Fn(Environment) -> Future<Environment> where F: Fn(&mut Environment) -> Result<(), Error>, { move |mut env: Environment| { let r = f(&mut env); match r { Ok(_) => env.fit(), Err(e) => { env.push(e); env.fit() } } } } lazy_static! 
{ pub static ref BUILTIN_FUNCTIONS: HashMap<Word, dict::Definition> = { #[cfg(not(feature = "database"))] let entries: Vec<(&str, &'static StepFn)>; #[cfg(feature = "database")] let mut entries: Vec<(&str, &'static StepFn)>; entries = vec![ ("*", Box::leak(Box::new(f_stack2(mult)))), ("+", Box::leak(Box::new(f_stack2(plus)))), ("get", Box::leak(Box::new(f_stack2(lookup)))), ("sort-indexed", Box::leak(Box::new(f_stack1(sort_by_key)))), ("-", Box::leak(Box::new(f_stack2(minus)))), ("/", Box::leak(Box::new(f_stack2(div)))), ("<", Box::leak(Box::new(f_stack2(lt)))), ("<=", Box::leak(Box::new(f_stack2(lte)))), ("=", Box::leak(Box::new(eq))), (">", Box::leak(Box::new(f_stack2(gt)))), (">=", Box::leak(Box::new(f_stack2(gte)))), ("abs", Box::leak(Box::new(f_stack1(abs)))), ("and", Box::leak(Box::new(f_stack2(and)))), ("animate", Box::leak(Box::new(animate))), ("assign", Box::leak(Box::new(f_stack3(assign)))), ( "association", Box::leak(Box::new(f_stack1::<assoc::Associative, Item, Error>( assoc::Associative::try_derive, ))), ), ( "association?", Box::leak(Box::new(f_stack1(is_association))), ), ( "attend", Box::leak(Box::new(f_stack1(crate::types::container::pipe::channel::select))), ), ("autoformat", Box::leak(Box::new(f_stack1(autoformat)))), ("branch", Box::leak(Box::new(branch))), ("bytes?", Box::leak(Box::new(f_stack1(is_bytes)))), ("cache", Box::leak(Box::new(f_result(write_blob)))), ("clone", Box::leak(Box::new(clone))), ("contains?", Box::leak(Box::new(f_stack2(contains)))), ("ceiling", Box::leak(Box::new(f_stack1(ceiling)))), ("compare", Box::leak(Box::new(f_stack2(compare)))), ("count", Box::leak(Box::new(f_stack1(count)))), ("dec", Box::leak(Box::new(f_stack1(dec)))), ("decache", Box::leak(Box::new(f_result(read_blob)))), ("decide", Box::leak(Box::new(decide))), ("decodejson", Box::leak(Box::new(f_stack1(decode_json)))), ("dictmerge", Box::leak(Box::new(f_result(dictmerge)))), ("dip", Box::leak(Box::new(dip))), ("dictionary", Box::leak(Box::new(dictionary))), 
("dipdown", Box::leak(Box::new(dipdown))), ("drop", Box::leak(Box::new(drop))), ("emit", Box::leak(Box::new(f_stack1(emit)))), ("empty", Box::leak(Box::new(f_stack1(empty)))), ("empty?", Box::leak(Box::new(f_stack1(is_empty)))), ("encodestring", Box::leak(Box::new(f_stack1(encode_string)))), ("encodenumber", Box::leak(Box::new(f_stack1(encode_number)))), ("encodejson", Box::leak(Box::new(f_stack1(encode_json)))), ( "environment", Box::leak(Box::new(f_stack1::<Environment, Item, Error>( Environment::try_derive, ))), ), ("environment?", Box::leak(Box::new(f_stack1(is_environment)))), ("error?", Box::leak(Box::new(f_stack1(is_error)))), ("eval-step", Box::leak(Box::new(eval_step_outer))), ("evaluate", Box::leak(Box::new(evaluate))), ("even?", Box::leak(Box::new(f_stack1(is_even)))), ("evert", Box::leak(Box::new(evert))), ("execute", Box::leak(Box::new(execute))), ("exp", Box::leak(Box::new(f_stack2(exp)))), ("fail", Box::leak(Box::new(f_result(fail)))), ( "file-in", Box::leak(Box::new(f_stack1(crate::types::container::pipe::fs::file_in))), ), ( "file-out", Box::leak(Box::new(f_stack1(crate::types::container::pipe::fs::file_out))), ), ("finished?", Box::leak(Box::new(f_stack1(is_finished)))), //("first", Box::leak(Box::new(f_stack1(first)))), ("float", Box::leak(Box::new(float))), ("floor", Box::leak(Box::new(f_stack1(floor)))), ("format", Box::leak(Box::new(f_stack2(format)))), ("handle", Box::leak(Box::new(handle))), ( "handoff", Box::leak(Box::new(crate::types::container::pipe::channel::handoff)), ), ( "hashbytes", Box::leak(Box::new(f_stack1(crate::crypto::hash))), ), ("inc", Box::leak(Box::new(f_stack1(inc)))), ("intersection", Box::leak(Box::new(f_stack2(intersection)))), ("inspect", Box::leak(Box::new(f_stack1(inspect)))), ("join", Box::leak(Box::new(f_stack2(join)))), ("key", Box::leak(Box::new(f_stack1(crate::crypto::key)))), ("last", Box::leak(Box::new(f_stack1(last)))), ("list?", Box::leak(Box::new(f_stack1(is_list)))), ("log", 
Box::leak(Box::new(f_stack2(log)))), ("loop", Box::leak(Box::new(loop_))), ("mod", Box::leak(Box::new(f_stack2(mod_)))), ("not", Box::leak(Box::new(f_stack1(not)))), ("number", Box::leak(Box::new(f_stack1::<Number, Item, Error>( Number::try_derive, )))), ("number?", Box::leak(Box::new(f_stack1(is_number)))), ("odd?", Box::leak(Box::new(f_stack1(is_odd)))), ("or", Box::leak(Box::new(f_stack2(or)))), ("pack", Box::leak(Box::new(f_result(pack)))), ("pop", Box::leak(Box::new(pop))), ("put", Box::leak(Box::new(put))), ("pipe?", Box::leak(Box::new(f_stack1(is_pipe)))), ( "random", Box::leak(Box::new(f_stack1(crate::crypto::random))), ), ("range", Box::leak(Box::new(f_stack3(range)))), ("read", Box::leak(Box::new(f_stack1(serialize::parse)))), ( "receiver", Box::leak(Box::new(f_stack1(crate::types::container::pipe::channel::receiver))), ), ("recur", Box::leak(Box::new(recur))), ("resume", Box::leak(Box::new(identity))), ("reverse", Box::leak(Box::new(f_stack1(reverse)))), ("round", Box::leak(Box::new(f_stack1(round)))), ("second", Box::leak(Box::new(f_stack1(second)))), ( "sender", Box::leak(Box::new(f_stack1(crate::types::container::pipe::channel::sender))), ), ( "serversocket", Box::leak(Box::new(f_stack2_async(crate::types::container::pipe::net::server_socket))), ), ( "set", Box::leak(Box::new(f_stack1::<coll::Set, Item, Error>( coll::Set::try_derive, ))), ), ("set?", Box::leak(Box::new(f_stack1(is_set)))), ("sign", Box::leak(Box::new(f_stack2(crate::crypto::sign)))), ("sink", Box::leak(Box::new(sink))), ("slice", Box::leak(Box::new(f_stack3(slice)))), ( "socket", Box::leak(Box::new(f_stack2_async(crate::types::container::pipe::net::socket))), ), ("sqrt", Box::leak(Box::new(f_stack1(sqrt)))), ("standard", Box::leak(Box::new(standard))), ("step", Box::leak(Box::new(step))), ("string", Box::leak(Box::new(f_stack1(string)))), ("string?", Box::leak(Box::new(f_stack1(is_string)))), ("swap", Box::leak(Box::new(swap))), ("swapdown", Box::leak(Box::new(swapdown))), ( "timer", 
Box::leak(Box::new(f_stack1(crate::types::container::pipe::channel::timer))), ), ("timestamps", Box::leak(Box::new(timestamps))), ("unassign", Box::leak(Box::new(f_stack2(unassign)))), ("take", Box::leak(Box::new(take))), ("unwrap", Box::leak(Box::new(unwrap))), ("using", Box::leak(Box::new(f_result(using)))), ( "verify", Box::leak(Box::new(f_stack3(crate::crypto::verify))), ), //("version", Box::leak(Box::new(f_stack2(version)))), ("word", Box::leak(Box::new(f_stack1::<Word, Item, Error>( Word::try_derive, )))), ("word?", Box::leak(Box::new(f_stack1(is_word)))), ("wrap", Box::leak(Box::new(wrap))), ("xor", Box::leak(Box::new(f_stack2(xor)))), ("yes", Box::leak(Box::new(yes))), ("zero?", Box::leak(Box::new(f_stack1(is_zero)))), ]; #[cfg(feature = "database")] { entries.push(("database", Box::leak(Box::new(f_stack2(db::query))))); entries.push(("persist", Box::leak(Box::new(f_stack1(db::insert_object))))); } HashMap::from_iter(entries.into_iter().map(|(s, f)| (Word::derive(s), dict::Definition::Axiom(f)))) }; } fn pair(i: Item, j: Item) -> Item { list!(i, j).fit() } pub fn plus(i: Number, j: Number) -> Result<Number, Error> { Ok(i.add(j)) } pub fn minus(i: Number, j: Number) -> Result<Number, Error> { Ok(i.subtract(j)) } pub fn mult(i: Number, j: Number) -> Result<Number, Error> { Ok(i.multiply(j)) } pub fn div(i: Number, j: Number) -> Result<Number, Error> { match (i, j) { (Number::Int(i), Number::Int(j)) => i .checked_div(j) .ok_or_else(Error::division_by_zero) .map(Number::Int), (Number::Float(i), Number::Float(j)) => Number::divide(i, j).map(Number::Float), (Number::Int(i), Number::Float(j)) => Number::divide(i as Float, j).map(Number::Float), (Number::Float(i), Number::Int(j)) => Number::divide(i, j as Float).map(Number::Float), } } pub fn mod_(i: Int, j: Int) -> Result<Int, Error> { Ok(i % j) } pub fn floor(i: Number) -> Result<Int, Error> { match i { Number::Int(i) => Ok(i), Number::Float(i) => Ok(i.floor() as Int), } } pub fn ceiling(i: Number) -> 
Result<Int, Error> { match i { Number::Int(i) => Ok(i), Number::Float(i) => Ok(i.ceil() as Int), } } pub fn round(i: Number) -> Result<Int, Error> { match i { Number::Int(i) => Ok(i), Number::Float(i) => Ok(i.round() as Int), } } pub fn exp(base: Int, exponent: Int) -> Result<Int, Error> { base.checked_pow(exponent as u32).ok_or(Error::overflow()) } pub fn log(value: Int, base: Int) -> Result<Float, Error> { if base <= 1 { Err(Error::too_small(base, 1)) } else if value <= 0 { Err(Error::too_small(value, 0)) } else { let base = base as Float; let value = value as Float; Ok(value.log(base)) } } pub fn inc(i: Int) -> Result<Int, Infallible> { Ok(i + 1) } pub fn dec(i: Int) -> Result<Int, Infallible> { Ok(i - 1) } pub fn is_zero(i: Number) -> Result<bool, Infallible> { match i { Number::Int(i) => Ok(i == 0), Number::Float(i) => Ok(i == 0.0), } } pub fn is_empty(i: Item) -> Result<bool, Infallible> { Ok(i.is_empty()) } pub fn gt(i: Number, j: Number) -> Result<bool, Infallible> { Ok(Number::gt(i, j)) } pub fn lt(i: Number, j: Number) -> Result<bool, Infallible> { Ok(Number::lt(i, j)) } pub fn gte(i: Number, j: Number) -> Result<bool, Infallible> { Ok(Number::gte(i, j)) } pub fn lte(i: Number, j: Number) -> Result<bool, Infallible> { Ok(Number::lte(i, j)) } pub fn join(i: coll::Sized, j: coll::Sized) -> Result<coll::Sized, Error> { i.join(j) } pub fn put(mut env: Environment) -> Future<Environment> { let j = env.pop(); let i = env.pop(); let i2 = i.clone(); let pr = coll::Receptacle::try_derive(i); match pr { Ok(p) => Box::pin(p.put(j).map(|f| { match f { Ok(p) => { env.pop_prog(); env.push(Item::Receptacle(p)) } Err(e) => { env.push(i2); env.push(e) } }; env })), Err(e) => { env.push(i2); env.push(e); env.fit() } } } pub fn clone(mut env: Environment) -> Future<Environment> { let clone = env.stack.front().unwrap().clone(); env.pop_prog(); env.push(clone); env.fit() } fn swap2(mut env: Environment, offset: usize) -> Future<Environment> { env.stack.mutate().swap(offset, 
offset + 1); env.fit() } pub fn swap(mut env: Environment) -> Future<Environment> { env.pop_prog(); swap2(env, 0) } pub fn swapdown(mut env: Environment) -> Future<Environment> { env.pop_prog(); swap2(env, 1) } pub fn sink(mut env: Environment) -> Future<Environment> { let stack = env.stack.mutate(); stack.swap(0, 2); stack.swap(0, 1); env.pop_prog(); env.fit() } pub fn float(mut env: Environment) -> Future<Environment> { let stack = env.stack.mutate(); stack.swap(0, 2); stack.swap(1, 2); env.pop_prog(); env.fit() } pub fn drop(mut env: Environment) -> Future<Environment> { env.pop(); env.pop_prog(); env.fit() } pub fn eq(mut env: Environment) -> Future<Environment> { let i = env.pop(); let j = env.pop(); env.pop_prog(); env.push(i == j); env.fit() } pub fn count(i: coll::Sized) -> Result<Int, Infallible> { Ok(i.count() as Int) } pub fn is_string(i: Item) -> Result<bool, Infallible> { Ok(matches!( i, Item::Dispenser(coll::Dispenser::Sized(coll::Sized::String(_))) | Item::Receptacle(coll::Receptacle::Sized(coll::Sized::String(_))) )) } pub fn is_bytes(i: Item) -> Result<bool, Infallible> { Ok(matches!( i, Item::Dispenser(coll::Dispenser::Sized(coll::Sized::Bytes(_))) | Item::Receptacle(coll::Receptacle::Sized(coll::Sized::Bytes(_))) )) } pub fn is_error(i: Item) -> Result<bool, Infallible> { Ok(matches!( i, Item::Dispenser(coll::Dispenser::Sized(coll::Sized::Associative( assoc::Associative::Error(_), ))) )) } pub fn is_word(i: Item) -> Result<bool, Infallible> { Ok(matches!(i, Item::Word(_))) } pub fn is_environment(i: Item) -> Result<bool, Infallible> { Ok(matches!( i, Item::Dispenser(coll::Dispenser::Sized(coll::Sized::Associative( assoc::Associative::Env(_) ))) | Item::Receptacle(coll::Receptacle::Sized(coll::Sized::Associative( assoc::Associative::Env(_) ))) )) } pub fn is_pipe(i: Item) -> Result<bool, Infallible> { Ok(matches!( i, Item::Dispenser(coll::Dispenser::Out(_)) | Item::Dispenser(coll::Dispenser::Tunnel(_)) | Item::Receptacle(coll::Receptacle::In(_)) | 
Item::Receptacle(coll::Receptacle::Tunnel(_)) )) } pub fn is_number(i: Item) -> Result<bool, Infallible> { Ok(matches!(i, Item::Number(_))) } pub fn is_list(i: Item) -> Result<bool, Infallible> { Ok(coll::Sized::try_derive(i) .map(|s| matches!(s, coll::Sized::List(_))) .unwrap_or(false)) } // pub fn first(c: coll::Sized) -> ItemResult { // let (_, i) = c.take(); // Ok(i.fit()) // } pub fn second(c: coll::List) -> ItemResult { Ok(c.get(1).cloned().unwrap_or_default()) } pub fn last(c: coll::Sized) -> ItemResult { Ok(c.into_iter().last().unwrap_or_default()) } pub fn loop_(mut env: Environment) -> Future<Environment> { let p = coll::List::try_derive(env.pop()); match p { Ok(mut p) => { env.pop_prog(); let f = env.pop(); if is_truthy(f) { let p2 = p.clone(); let pm = Arc::make_mut(&mut p); pm.push_back(Item::derive(p2)); pm.push_back("loop".fit()); env.program.prepend(p); } } Err(e) => env.push(e), } env.fit() } pub fn execute(mut env: Environment) -> Future<Environment> { let i = env.pop(); match coll::List::try_derive(i) { Ok(program) => { env.pop_prog(); env.program.prepend(program); } Err(e) => { env.push(e); } } env.fit() } pub fn wrap(mut env: Environment) -> Future<Environment> { let item = env.pop(); env.pop_prog(); env.push(list!(item)); env.fit() } pub fn unwrap(mut env: Environment) -> Future<Environment> { match coll::List::try_derive(env.pop()) { Ok(l) => { env.pop_prog(); env.stack.prepend_iter(l.iter().cloned().rev()); } Err(e) => { env.push(e); } }; env.fit() } /// If it's a word, don't bother wrapping and /// unwrapping, just flag it as quoted, and the /// evaluator will just push it unexamined. 
fn dip_quote(i: &mut Item) { if let Item::Word(ref mut w) = i { w.quoted = true; } } pub fn dip(mut env: Environment) -> Future<Environment> { match coll::List::try_derive(env.pop()) { Ok(program) => { let mut item = env.pop(); let expr = env.program.mutate(); expr.pop_front(); dip_quote(&mut item); expr.push_front(item); env.program.prepend(program) } Err(e) => env.push(e), } env.fit() } pub fn dipdown(mut env: Environment) -> Future<Environment> { match coll::List::try_derive(env.pop()) { Ok(program) => { let mut item1 = env.pop(); let mut item2 = env.pop(); let prog = env.program.mutate(); prog.pop_front(); dip_quote(&mut item1); dip_quote(&mut item2); prog.push_front(item1); prog.push_front(item2); env.program.prepend(program) } Err(e) => env.push(e), } env.fit() } pub fn take(mut env: Environment) -> Future<Environment> { // TODO: handle Nothing case let i = env.pop(); let i2 = i.clone(); let r = coll::Dispenser::try_derive(i); match r { Ok(d) => Box::pin(async move { let (i, c) = d.take().await; env.pop_prog(); let stack = Arc::make_mut(&mut env.stack); stack.push_front(c.fit()); stack.push_front(coll::result_to_option(i).unwrap_or_default()); env }), Err(e) => { let stack = Arc::make_mut(&mut env.stack); stack.push_front(i2); stack.push_front(e.fit()); env.fit() } } } pub fn pop(mut env: Environment) -> Future<Environment> { let i = env.pop(); let i2 = i.clone(); let s = coll::Sized::try_derive(i); match s { Ok(it) => { let (c, i) = it.pop(); env.pop_prog(); env.push(c); env.push(i.unwrap_or_default()); } Err(e) => { env.push(i2); env.push(e); } } env.fit() } pub fn is_truthy(i: Item) -> bool { match i { Item::Dispenser(coll::Dispenser::Sized(d)) => !d.is_empty(), Item::Receptacle(coll::Receptacle::Sized(r)) => !r.is_empty(), _ => true, } } pub fn branch(mut env: Environment) -> Future<Environment> { match ( coll::List::try_derive(env.pop()), coll::List::try_derive(env.pop()), ) { (Ok(false_branch), Ok(true_branch)) => { env.pop_prog(); let b = env.pop(); 
env.program.prepend(if is_truthy(b) { true_branch } else { false_branch }) } (Err(e), _) => env.push(e), (_, Err(e)) => env.push(e), } env.fit() } pub fn step(mut env: Environment) -> Future<Environment> { let p = coll::List::try_derive(env.pop()).unwrap(); let dispenser = coll::Dispenser::try_derive(env.pop()).unwrap(); Box::pin(async move { let (r, dispenser) = dispenser.take().await; if let Some(litem) = coll::result_to_option(r) { let prog = env.program.mutate(); // prepare the next iteration, even if the iterator is now // empty. step is still the next instruction, so we don't // pop it off. prog.push_front(p.clone().fit()); prog.push_front(dispenser.fit()); env.program.prepend(p); env.push(litem); } else { // if the container is empty, just pop off 'step' and we're done env.pop_prog(); } env }) } pub fn range(from: Int, to: Int, stepby: Int) -> Result<coll::List, Infallible> { Ok(coll::List::derive_iter( (from..to).step_by(stepby as usize).map(Item::derive), )) } // (effect [rec2 rec1 then pred] // ['[if] //[(concat rec1 // [[pred then rec1 rec2 'recur]] rec2) // then pred]]) pub fn recur(mut env: Environment) -> Future<Environment> { let mut rec2 = coll::List::try_derive(env.pop()).unwrap(); let mut rec1 = coll::List::try_derive(env.pop()).unwrap(); let then = coll::List::try_derive(env.pop()).unwrap(); let pred = coll::List::try_derive(env.pop()).unwrap(); env.pop_prog(); env.push_prog("if".fit()); let r = list!( pred.clone(), then.clone(), rec1.clone(), rec2.clone(), "recur", ) .fit(); // I think i did this right - used to create a new list and extend // it with rec1, then push r, then extend again with rec2. now // start with rec1 (copied on write), then push r, then extend // with rec2. That should be equivalent. 
let rm = Arc::make_mut(&mut rec1); rm.push_back(r); rm.extend(Arc::make_mut(&mut rec2).drain(..)); //env.pop_expr(); env.push(pred); env.push(then); env.push(rec1); env.fit() } //(fn [{[l & others] 'stack :as env}] // (assoc env 'stack (apply list (vec others) l))) pub fn evert(mut env: Environment) -> Future<Environment> { let mut l = coll::List::try_derive(env.pop()).unwrap(); mem::swap(&mut env.stack, &mut l); env.pop_prog(); env.push(l); env.fit() } fn assoc_in(i: Option<Item>, ks: &[assoc::KeyItem], v: Item) -> Result<Item, Error> { fn assoc_vec(mut l: coll::List, ks: &[assoc::KeyItem], k: Int, v: Item) -> Result<Item, Error> { let lm = Arc::make_mut(&mut l); let idx = k as usize; // extend the size of the vector to be big enough if lm.len() <= idx { lm.resize(idx + 1, Item::default()); } lm[idx] = if ks.is_empty() { v } else { assoc_in(lm.get(idx).cloned(), ks, v)? }; Ok(l.fit()) } fn assoc_map( a: assoc::Associative, ks: &[assoc::KeyItem], k: &assoc::KeyItem, v: Item, ) -> Result<Item, Error> { let inner = a.get(&k).clone(); if ks.is_empty() { Ok(a.insert(k.clone(), v).0.fit()) } else { Ok(a.insert(k.clone(), assoc_in(inner, ks, v)?).0.fit()) } } if let [k, ks @ ..] 
= ks { match (i, k) { // An int key for a list means update that index ( Some(Item::Dispenser(coll::Dispenser::Sized(coll::Sized::List(l)))), assoc::KeyItem::Int(k), ) => assoc_vec(l, ks, *k, v), ( Some(Item::Receptacle(coll::Receptacle::Sized(coll::Sized::List(l)))), assoc::KeyItem::Int(k), ) => assoc_vec(l, ks, *k, v), // An int key for an associative means an integer key, which is uncommon // but we'll support it ( Some(Item::Dispenser(coll::Dispenser::Sized(coll::Sized::Associative(a)))), assoc::KeyItem::Int(k), ) => assoc_map(a, ks, &assoc::KeyItem::Int(*k), v), ( Some(Item::Receptacle(coll::Receptacle::Sized(coll::Sized::Associative(a)))), assoc::KeyItem::Int(k), ) => assoc_map(a, ks, &assoc::KeyItem::Int(*k), v), // An int key for a non-sized type means we're overwriting // whatever it is with a list, with the value at that index (_, assoc::KeyItem::Int(k)) => assoc_vec(coll::List::fresh(), ks, *k, v), // Where there was nothing at a given index/key, and a non-int // key, create a map (None, k) => assoc_map( assoc::Associative::Assoc(assoc::Association::fresh()), ks, k, v, ), // Whatever it is, treat it as a map if possible (Some(i), k) => { let a = assoc::Associative::try_derive(i)?; assoc_map(a, ks, k, v) } } } else { Ok(i.unwrap()) } } fn unassoc_in(i: Item, ks: &[assoc::KeyItem]) -> Result<Item, Error> { if let [k, ks @ ..] 
= ks { if ks.is_empty() { let a = assoc::Associative::try_derive(i)?; Ok(a.remove(k).0.fit()) } else { match (i, k) { ( Item::Dispenser(coll::Dispenser::Sized(coll::Sized::List(mut l))), assoc::KeyItem::Int(k), ) => { let lm = l.mutate(); let old_value = if let Some(item) = lm.get_mut(*k as usize) { mem::take(item) } else { return Err(Error::short_list(*k)); // replace with your error }; let new_value = unassoc_in(old_value, ks)?; lm[*k as usize] = new_value; Ok(l.fit()) } (a, k) => { let a: assoc::Associative = a.try_fit()?; let mut a = assoc::Association::derive_iter(a.to_iter()); let am = a.mutate(); let mut res: Option<Result<_, Error>> = None; am.entry(k.clone()).and_modify(|v| { let new_value = unassoc_in(v.clone(), ks); res = Some(new_value.map(|nv| { *v = nv; })); }); if let Some(Err(e)) = res { return Err(e); } Ok(a.fit()) } } } } else { Ok(i) } } pub fn assign(m: Item, ks: Item, v: Item) -> ItemResult { //println!("Assign! {:?}", m); let kit = coll::List::try_derive(ks)?; let mut ksvec: assoc::KeyList = assoc::KeyList::try_from_iter(kit.iter().cloned())?; ksvec.mutate().make_contiguous(); let (ks, _) = ksvec.as_slices(); assoc_in(Some(m), ks, v) } pub fn unassign(m: Item, ks: Item) -> ItemResult { let kit = coll::List::try_derive(ks)?; let mut ksvec: assoc::KeyList = assoc::KeyList::try_from_iter(kit.iter().cloned())?; ksvec.mutate().make_contiguous(); let (ks, _) = ksvec.as_slices(); unassoc_in(m, ks) } pub fn lookup(i: coll::Sized, k: assoc::KeyItem) -> ItemResult { //println!("lookup {:?} \n {:?}", i, k); //let k = assoc::KeyItem::try_derive(k)?; //let i = coll::Sized::try_derive(i)?; match (i, k) { (coll::Sized::List(l), assoc::KeyItem::Int(k)) => { Ok(l.get(k as usize).cloned().unwrap_or_default()) } (coll::Sized::String(s), assoc::KeyItem::Int(k)) => { //let s = s.inner(); s.chars() .nth(k as usize) .map_or(Ok(Item::default()), |c| Ok(c.fit())) } (coll::Sized::Bytes(b), assoc::KeyItem::Int(k)) => b .get(k as usize) .cloned() 
.map_or(Ok(Item::default()), |c| Ok((c as i64).fit())), (i, k) => { let m = assoc::Associative::try_derive(i)?; Ok(m.get(&k).unwrap_or_default()) } } } pub fn contains(c: Item, i: Item) -> Result<bool, Infallible> { match coll::Sized::try_derive(c) { Ok(c) => Ok(c.has(&i)), Err(_) => Ok(false), } } pub fn or(i: Item, j: Item) -> ItemResult { Ok(if is_truthy(i.clone()) { i } else if is_truthy(j.clone()) { j } else { Item::default() }) //Ok(Item::derive(is_truthy(i) || is_truthy(j))) } pub fn and(i: Item, j: Item) -> ItemResult { Ok(if is_truthy(i) && is_truthy(j.clone()) { j } else { Item::default() }) } pub fn not(i: Item) -> ItemResult { Ok(Item::derive(!is_truthy(i))) } pub fn is_association(i: Item) -> Result<bool, Error> { Ok(coll::Sized::try_derive(i) .map(|s| matches!(s, coll::Sized::Associative(_))) .unwrap_or(false)) } pub fn is_set(i: Item) -> Result<bool, Error> { Ok(coll::Sized::try_derive(i) .map(|s| matches!(s, coll::Sized::Set(_))) .unwrap_or(false)) } pub fn is_odd(i: Int) -> Result<bool, Error> { Ok(i & 1 == 1) } pub fn is_even(i: Int) -> Result<bool, Error> { Ok(i & 1 == 0) } pub fn decide(mut env: Environment) -> Future<Environment> { let mut clauses = coll::List::try_derive(env.pop()).unwrap(); let clauses_data = Arc::make_mut(&mut clauses); let clause = clauses_data.pop_front(); if let Some(clause) = clause { let clause: Result<coll::List, Error> = clause.try_fit(); match clause { Ok(mut clause) => { if clause.len() != 2 { env.push(Error::list_count(2)); } else { let clause_data = clause.mutate(); let test: Result<coll::List, Error> = clause_data .pop_front() .ok_or(Error::list_count(2)) .and_then(|i| i.try_fit()); let expr: Result<coll::List, Error> = clause_data .pop_front() .ok_or(Error::list_count(2)) .and_then(|i| i.try_fit()); match (test, expr) { (Ok(test), Ok(expr)) => { // construct if let testp = list!(test, "shield"); let newexpr = list!( testp, expr, list!(clauses, "decide"), list!("shield"), // This is the definition of 'dipdown' // 
which we don't want to depend on so // early in bootstrapping "wrap", list!("dip"), "join", "dip", // end 'dipdown' "branch" ); env.pop_prog(); env.program.prepend(newexpr); } (Err(test), _) => { env.push(test); } (_, Err(expr)) => { env.push(expr); } } } } Err(e) => { env.push(e); } } } else { // clauses empty, return nothing env.pop_prog(); env.push(Item::default()); } env.fit() } pub fn emit(l: coll::List) -> ItemResult { Ok(Item::Dispenser(coll::Dispenser::Sized( coll::Sized::String(l.iter().emit()), ))) } pub fn autoformat(i: Item) -> ItemResult { let s = String::try_derive(i)?; Ok(Item::Dispenser(coll::Dispenser::Sized( coll::Sized::String(serialize::auto_format(s.as_str(), 20, 80)), ))) } pub fn eval_step(mut env: Environment) -> Future<Environment> { //println!("{:?}", env); //println!("Dictionary size: {}", env.dictionary.len()); let next_item = env.program.front(); if let Some(val) = next_item { match val { Item::Word(word) => { if word.quoted { // word was quoted (see axiom::dip), just push onto stack // and remove the quotedness let mut w = word.clone(); env.pop_prog(); w.quoted = false; env.push(w); env.fit() } else { if let Some(dfn) = env.dictionary.cache.get(&word) { { if let Some(spec) = &dfn.spec { //println!("Checking spec for {:?}: {:?}", word, spec.0); if let Err(e) = env.check_input_spec(&spec.0) { env.push(e); return env.fit(); } } else { // println!("No spec for {}!", word); } match &dfn.definition { dict::Definition::Axiom(a) => (*a)(env), dict::Definition::Derived(d) => { let items = d.clone(); env.pop_prog(); env.program.prepend(items); env.fit() } } } } else { //let w = word.clone(); if *word == Word::derive("times5") { println!("Failed dict lookup: {:?} ", env.dictionary.cache); } env.push(Error::undefined(word.clone().fit())); env.fit() } } } _ => { // not a word, just push onto stack let i = env.pop_prog(); env.push(i); env.fit() } } } else { env.push(Error::short_list(1)); env.fit() } } fn reverse(s: coll::Sized) -> 
Result<coll::Sized, Error> { match s { coll::Sized::List(mut l) => Ok({ l.reverse(); l.fit() }), coll::Sized::String(s) => Ok(s.chars().rev().collect::<String>().fit()), coll::Sized::Bytes(b) => Ok(b.into_iter().rev().collect::<Vec<u8>>().fit()), s => Err(Error::expected("ordered", s)), } } fn encode_string(s: String) -> Result<Bytes, Infallible> { Ok(s.as_bytes().to_vec()) } fn encode_number(n: Number) -> Result<Bytes, Infallible> { match n { Number::Int(i) => Ok(i.to_be_bytes().to_vec()), Number::Float(f) => Ok(f.to_be_bytes().to_vec()), } } fn string(i: Item) -> Result<String, Infallible> { match i { Item::Dispenser(coll::Dispenser::Sized(coll::Sized::Bytes(b))) => { Ok(std::str::from_utf8(&b).unwrap().to_string()) } Item::Receptacle(coll::Receptacle::Sized(coll::Sized::Bytes(b))) => { Ok(std::str::from_utf8(&b).unwrap().to_string()) } Item::Dispenser(coll::Dispenser::Sized(s)) => { if s.is_empty() { Ok("".to_string()) } else { Ok((&Item::derive(s)).emit()) } } Item::Receptacle(coll::Receptacle::Sized(s)) => { if s.is_empty() { Ok("".to_string()) } else { Ok((&Item::derive(s)).emit()) } } i => Ok((&Item::derive(i)).emit()), } } fn get_error(env: &Environment) -> Option<Error> { env.stack.front().and_then(|i| match i { Item::Dispenser(coll::Dispenser::Sized(coll::Sized::Associative( assoc::Associative::Error(e), ))) => Some(e.clone()), Item::Receptacle(coll::Receptacle::Sized(coll::Sized::Associative( assoc::Associative::Error(e), ))) => Some(e.clone()), _ => None, }) } fn unwind(mut env: Environment) -> Environment { let err = env.pop(); let handle: &Item = &"handle".fit(); let err = match err { Item::Dispenser(coll::Dispenser::Sized(coll::Sized::Associative( assoc::Associative::Error(mut e), ))) => { let mut next = env.program.front(); let data = e.data.mutate(); let mut unwound_arc: coll::List = data .remove(&"unwound".fit()) .unwrap_or_default() .try_fit() .unwrap_or_else(|_| list!()); let unwound = unwound_arc.mutate(); while next.is_some() && next.unwrap() 
!= handle { let i = env.pop_prog(); unwound.push_back(i); next = env.program.front(); } if next.is_some() { // didn't unwind the whole program, handled error env.pop_prog(); // set the is_handled bit e.is_handled = true; } let em = Arc::make_mut(&mut e.data); em.insert("unwound".fit(), unwound_arc.fit()); e.fit() } i => i, }; env.push(err); env } pub async fn eval(mut env: Environment) -> Environment { loop { if let Some(err) = get_error(&env) { if !err.is_handled { env = unwind(env); // TODO: this should be done in eval_step }; } if !env.program.is_empty() { env = eval_step(env).await; } else { break; } } env } pub fn eval_step_outer(mut env: Environment) -> Future<Environment> { let tos = env.pop(); let inner_env = Environment::try_derive(tos); match inner_env { Ok(inner) => { env.pop_prog(); if inner.program.is_empty() { Box::pin(async move { env.push(Item::default()); env }) } else { Box::pin(eval_step(inner).map(|inner_next| { env.push(inner_next); env })) } } Err(e) => { env.push(e); env.fit() } } } pub fn evaluate(mut env: Environment) -> Future<Environment> { let tos = env.pop(); let inner_env = Environment::try_derive(tos); match inner_env { Ok(inner) => Box::pin(eval(inner).map(|inner_done| { env.pop_prog(); env.push(inner_done); env })), Err(e) => { env.push(e); env.fit() } } } pub fn identity(mut env: Environment) -> Future<Environment> { env.pop_prog(); env.fit() } pub fn dictionary(mut env: Environment) -> Future<Environment> { //println!("adding dictionary"); let d = env.dictionary.clone(); env.pop_prog(); env.push(d); env.fit() } fn sqrt(i: Number) -> Result<Number, Infallible> { Ok(i.sqrt()) } fn abs(i: Number) -> Result<Number, Infallible> { Ok(i.abs()) } /// If there's an unhandled error on the stack, handle it, otherwise /// no-op. 
fn handle(mut env: Environment) -> Future<Environment> { env.pop_prog(); match env.stack.mutate().pop_front() { None => {} Some(i) => match i { Item::Dispenser(coll::Dispenser::Sized(coll::Sized::Associative( assoc::Associative::Error(mut e), ))) => { e.is_handled = true; env.push(e); } Item::Receptacle(coll::Receptacle::Sized(coll::Sized::Associative( assoc::Associative::Error(mut e), ))) => { e.is_handled = true; env.push(e); } i => env.push(i), }, }; env.fit() } /// Makes 'yes' a word that doesn't have to be quoted, just pushes /// itself onto the stack. pub fn yes(mut env: Environment) -> Future<Environment> { let t = env.pop_prog(); env.push(t); env.fit() } pub fn fail(env: &mut Environment) -> Result<(), Error> { let mut err = Error::try_derive(env.pop())?; err.is_handled = false; env.pop_prog(); env.push(err); Ok(()) } /// Takes a dictionary diff, merges it into an existing dictionary, /// with all the changes marked with the given namespace. pub fn dictmerge(env: &mut Environment) -> Result<(), Error> { //println!("dictmerge: {:?}", env); let modified = dict::Dictionary::try_derive(env.pop())?; let mut existing = dict::Dictionary::try_derive(env.pop())?; let ns = env.pop(); let namespace = dict::Namespace::try_derive(ns)?; existing.merge(modified, &namespace); // not sure if this is really needed - was here for use during // bootstrapping when some builtins are not yet loaded but need to // be // if namespace.is_none() { // env.dictionary.words.add_builtins(); // } // pop the word dictmerge env.pop_prog(); env.push(existing); Ok(()) } /// Fetches a binary blob from the cache. The top of stack should be /// either the hash of the content or its alias (a [Word]). 
pub fn read_blob(env: &mut Environment) -> Result<(), Error> { //println!("Env: {:?}", env); let cache = config::PlatformConfig::get()?.cache; let contents = match env.pop() { Item::Word(alias) => cache.get(&cache::Key::Alias(alias.fit()))?, i => { let hash = Bytes::try_derive(i)?; cache.get(&cache::Key::Hash(hash))? } }; env.pop_prog(); env.push(contents); Ok(()) } /// Writes a given binary object to the cache. Supports [Bytes], and /// certain kinds of pipes. The top of stack should be the alias to /// store the contents under, which should be either a [Word] or /// nothing. If nothing, the object will only be available via its /// hash. Returns the hash. pub fn write_blob(env: &mut Environment) -> Result<(), Error> { let alias = match env.pop() { Item::Word(w) => Some(w.fit()), _ => None, }; match env.pop() { Item::Dispenser(coll::Dispenser::Sized(coll::Sized::Bytes(b))) => { let cache = config::PlatformConfig::get()?.cache; let hash = cache.put(&b, alias)?; env.push(hash); env.pop_prog(); Ok(()) } i => Err(Error::expected("bytes", i)), } } /// Takes an inner environment from the top of the stack, and spawns a /// tokio task to evaluate that environment. 
pub fn animate(mut env: Environment) -> Future<Environment> { let tos = env.pop(); let inner_env = Environment::try_derive(tos); match inner_env { Ok(inner) => { env.pop_prog(); tokio::spawn(async move { eval(inner).await }); env.fit() } Err(e) => { env.push(e); env.fit() } } } fn xor_(i: Bytes, j: Bytes) -> Bytes { let len = std::cmp::max(i.len(), j.len()); let mut result = Vec::with_capacity(len); for (byte_i, byte_j) in i .iter() .chain(std::iter::repeat(&0).take(len - i.len())) .zip(j.iter().chain(std::iter::repeat(&0).take(len - j.len()))) { result.push(byte_i ^ byte_j); } result } pub fn xor(i: Item, j: Item) -> ItemResult { match (i, j) { (Item::Number(Number::Int(i)), Item::Number(Number::Int(j))) => { Ok(Item::Number(Number::Int(i ^ j))) } ( Item::Dispenser(coll::Dispenser::Sized(coll::Sized::Bytes(i))), Item::Dispenser(coll::Dispenser::Sized(coll::Sized::Bytes(j))), ) => Ok(xor_(i, j).fit()), ( Item::Receptacle(coll::Receptacle::Sized(coll::Sized::Bytes(i))), Item::Receptacle(coll::Receptacle::Sized(coll::Sized::Bytes(j))), ) => Ok(xor_(i, j).fit()), (i, j) => Err(Error::expected("integers", pair(i, j))), } } pub fn inspect(i: Item) -> Result<String, Infallible> { Ok(format!("{:?}", i)) } pub fn timestamps(mut env: Environment) -> Future<Environment> { env.pop_prog(); env.push(Item::Dispenser(coll::Dispenser::Out(pipe::Out::Time))); env.fit() } pub fn standard(mut env: Environment) -> Future<Environment> { env.pop_prog(); env.push(Item::Dispenser(coll::Dispenser::Tunnel( pipe::Tunnel::Standard, ))); env.fit() } pub fn intersection(i: Item, j: Item) -> ItemResult { let i = coll::Set::try_derive(i)?; let j = coll::Set::try_derive(j)?; let ij = i.intersection(&j); let h = std::collections::HashSet::from_iter(ij.cloned()); Ok(coll::Set::derive(h).fit()) } pub fn compare(i: Item, j: Item) -> ItemResult { let ki = assoc::KeyItem::try_derive(i)?; let kj = assoc::KeyItem::try_derive(j)?; match ki.partial_cmp(&kj) { Some(std::cmp::Ordering::Less) => 
Ok("less".fit()), Some(std::cmp::Ordering::Equal) => Ok("equal".fit()), Some(std::cmp::Ordering::Greater) => Ok("greater".fit()), None => Err(Error::expected("comparable", pair(ki.fit(), kj.fit()))), } } fn as_pair(i: Item) -> Result<(Item, assoc::KeyItem), Error> { let mut i = coll::List::try_derive(i)?; let im = i.mutate(); let j = im.pop_front().ok_or(Error::short_list(1))?; let k = im .pop_front() .ok_or(Error::short_list(2)) .and_then(assoc::KeyItem::try_derive)?; Ok((j, k)) } pub fn sort_by_key(l: coll::Sized) -> Result<coll::List, Error> { let it = l.into_iter().map(as_pair); let mut it = it.collect::<Result<Vec<(Item, assoc::KeyItem)>, Error>>()?; it.sort_unstable_by(|(_, a), (_, b)| a.partial_cmp(b).unwrap_or(std::cmp::Ordering::Less)); Ok(coll::List::derive_iter(it.into_iter().map(|(k, _)| k))) } fn slice(arr: Item, start: Item, end: Item) -> ItemResult { //println!("Start: {:?}, End: {:?}", start, end); let arr = coll::Sized::try_derive(arr)?; let mut start = Int::try_derive(start)?; let mut end = Int::try_derive(end)?; if start < 0 { start += arr.count() as i64; } if end <= 0 { end += arr.count() as i64; } if start < 0 { return Err(Error::negative(start)); } if end < 0 { return Err(Error::negative(end)); } if start > end { return Err(Error::create( list!("<="), "invalid index range", Some(pair(start.fit(), end.fit())), )); } match arr { coll::Sized::Bytes(arr) => Ok(arr .get(start as usize..end as usize) .map(|a| a.to_vec()) .fit()), coll::Sized::String(arr) => Ok(arr .get(start as usize..end as usize) .map(|a| a.to_string()) .fit()), coll::Sized::List(arr) => { let owned_subset: VecDeque<Item> = arr .iter() .skip(start as usize) .take(end as usize - start as usize) .cloned() .collect(); Ok(coll::List::derive(owned_subset).fit()) } i => Err(Error::expected("ordered", i)), } } fn empty(s: Item) -> ItemResult { let s = coll::Sized::try_derive(s)?; Ok(s.empty().fit()) } fn format(fstr: Item, items: Item) -> ItemResult { let fstr = 
String::try_derive(fstr)?; let items = coll::List::try_derive(items)?; let strings = items .iter() .cloned() .map(String::try_derive) .collect::<Result<Vec<String>, Error>>()?; Ok(SimpleCurlyFormat .format(fstr.as_str(), strings.as_slice())? .into_owned() .fit()) } impl<'a> From<dynfmt::Error<'a>> for Error { fn from(err: dynfmt::Error) -> Error { Error::create(list!("format"), &err.to_string(), Option::<Item>::None) } } fn decode_json(s: Item) -> ItemResult { let s = String::try_derive(s)?; Ok(serde_json::from_str::<Item>(s.as_str())?) } fn encode_json(i: Item) -> ItemResult { Ok(Item::derive(serde_json::to_string(&i)?)) } fn is_finished(env: Environment) -> Result<bool, Infallible> { Ok(env.is_finished()) } fn using(env: &mut Environment) -> Result<(), Error> { let mut namespaces: Vec<dict::Namespace> = env.pop().try_fit()?; let mut inner_env: Environment = env.pop().try_fit()?; env.pop_prog(); mem::swap(&mut inner_env.dictionary.modules, &mut namespaces); inner_env.dictionary.modules.extend(namespaces); env.push(inner_env); Ok(()) } fn pack(env: &mut Environment) -> Result<(), Error> { let template: coll::List = env.pop().try_fit()?; let mut last_stack_item_used = 0; fn stackpoint(w: Word) -> Option<(usize, bool)> { let s = String::derive(w.clone()); //println!("string! {:?}", s); if let Some(slot) = s.strip_prefix("**") { slot.parse::<usize>() .ok() .and_then(|n| if n > 0 { Some((n, true)) } else { None }) } else if let Some(slot) = s.strip_prefix("*") { slot.parse::<usize>() .ok() .and_then(|n| if n > 0 { Some((n, false)) } else { None }) } else { None } } /// Takes an accumulator that is a pair of the stack and the /// compiled result so far, and then an item to incorporate. If /// the item is a template placeholder, inserts or splices the /// item from the stack. 
fn splice( acc: Result<(Stack, &mut usize, Vec<Item>), Error>, i: Item, ) -> Result<(Stack, &mut usize, Vec<Item>), Error> { match acc { Ok((stack, last_stack_item_used, mut list)) => match i { Item::Word(w) => { if let Some((n, is_splice)) = stackpoint(w.clone()) { *last_stack_item_used = max(n, *last_stack_item_used); let r = stack .get(n - 1) .ok_or_else(|| Error::list_count(n as Int)) .cloned(); match r { Ok(item) => { if is_splice { match item { Item::Dispenser(coll::Dispenser::Sized(s)) => { list.extend(s) } Item::Receptacle(coll::Receptacle::Sized(s)) => { list.extend(s) } i => return Err(Error::expected("sized", i)), } } else { list.push(item); } Ok((stack, last_stack_item_used, list)) } Err(e) => Err(e), } } else { list.push(Item::Word(w.clone())); Ok((stack, last_stack_item_used, list)) } } Item::Dispenser(coll::Dispenser::Sized(coll::Sized::List(l))) => { // recurse let (stack, last_stack_item_used, filled) = l.iter().cloned().fold( Ok((stack.clone(), last_stack_item_used, Vec::new())), splice, )?; list.push(coll::List::derive(filled).fit()); Ok((stack, last_stack_item_used, list)) } i => { list.push(i); Ok((stack, last_stack_item_used, list)) } }, Err(e) => Err(e), } } let (_, last_stack_item_used, filled) = template.iter().cloned().fold( Ok((env.stack.clone(), &mut last_stack_item_used, Vec::new())), splice, )?; env.pop_prog(); // pop all the used items for _ in 0..*last_stack_item_used { env.pop(); } env.push(coll::List::derive(filled)); Ok(()) }
1.5.6. Top level library
Here is the top level for using kcats as a library, either in another Rust project or from some other language through FFI.
pub mod axiom; mod crypto; pub mod serialize; pub mod traits; pub mod types; #[cfg(target_os = "android")] mod android { use crate::axiom; use crate::config::PlatformConfig; use crate::serialize::{self, Emit}; use crate::types::container::environment::Environment; use std::ffi::CString; use std::path::PathBuf; use cache::cache; use jni::objects::{JClass, JString}; use jni::sys::jstring; use jni::JNIEnv; use libc::c_char as lc_char; #[link(name = "log")] extern "C" { fn __android_log_print(prio: i32, tag: *const lc_char, fmt: *const lc_char, ...) -> i32; } const ANDROID_LOG_INFO: i32 = 4; pub fn log(message: &str) { let tag = CString::new("kcats").unwrap(); let message = CString::new(message).unwrap(); unsafe { __android_log_print(ANDROID_LOG_INFO, tag.as_ptr(), message.as_ptr()); } } #[no_mangle] pub extern "system" fn Java_org_skyrod_subverse_MainActivity_kcatsEval<'local>( mut jnienv: JNIEnv<'local>, _class: JClass<'local>, env: *mut Environment, program: JString, ) -> jstring { log("Starting eval"); let mut program: String = jnienv .get_string(&program) .expect("Couldn't get java string!") .into(); // to ensure errors are handled by the repl- so that the // user can continue with more input. 
program.push_str(" handle"); log(format!("Got program {:?}", program).as_str()); if env.is_null() { return jnienv .new_string("Invalid environment pointer") .unwrap() .as_raw(); } log("Taking pointer ownership"); // Take ownership of the Environment unsafe { let mut env_val = std::ptr::read(env); log("Parsing input"); match serialize::parse_input(&mut env_val, program) { Ok(_) => { // Execute the eval and re-assign the result back to the env pointer log("Executing environment"); env_val = futures::executor::block_on(async move { axiom::eval(env_val).await }); log("Formatting result"); let result = serialize::auto_format(env_val.stack.iter().emit().as_str(), 20, 80); // Write the updated environment back to the pointer std::ptr::write(env, env_val); // Convert the evaluation result back to a C string jnienv.new_string(result).unwrap().as_raw() } Err(e) => jnienv .new_string(format!("Error: {:?}", e)) .unwrap() .as_raw(), } } } #[no_mangle] pub extern "system" fn Java_org_skyrod_subverse_MainActivity_kcatsNew<'local>( mut jnienv: JNIEnv<'local>, _class: JClass<'local>, cachepath: JString, dbfile: JString, ) -> *mut Environment { //panic!("oh noes"); log("creating new kcats env"); let cacheloc: String = jnienv .get_string(&cachepath) .expect("Couldn't get java string!") .into(); log("creating new cache"); let cache = cache::Cache::new(PathBuf::from(cacheloc)).expect("Valid cache location"); let dbloc: String = jnienv .get_string(&dbfile) .expect("Couldn't get java string!") .into(); log("setting platform config"); let result = std::panic::catch_unwind(|| { // Your potentially panicking code here PlatformConfig::init(PathBuf::from(dbloc), cache).expect("Failed platform init"); Box::into_raw(Box::new(Environment::default())) }); match result { Ok(value) => return value, Err(e) => { if let Some(s) = e.downcast_ref::<String>() { log(format!("Panic occurred: {}", s).as_str()); } else if let Some(s) = e.downcast_ref::<&str>() { log(format!("Panic occurred: {}", 
s).as_str()); } // Handle the panic } } panic!("uh oh"); } #[no_mangle] pub extern "C" fn Java_org_skyrod_subverse_MainActivity_katsFree(env: *mut Environment) { log("FREE"); if !env.is_null() { unsafe { drop(Box::from_raw(env)); } } } } pub mod config { use crate::types::container::error::Error; use crate::types::Item; use cache::cache; use directories::ProjectDirs; use std::path::Path; use std::path::PathBuf; use std::sync::Arc; use lazy_static::lazy_static; use std::sync::RwLock; lazy_static! { pub static ref PLATFORM_CONFIG: RwLock<Option<PlatformConfig>> = { //println!("Creating PLATFORM_CONFIG at {}:{}", file!(), line!()); RwLock::new(None) }; } /// A configuration struct for the platform we're running on, /// specifies where some filesystem resources are located. On some /// platforms (like android) we can't guess and will only know at /// runtime. #[derive(Clone, Debug)] pub struct PlatformConfig { pub database: Option<Arc<PathBuf>>, pub cache: Arc<cache::Cache>, } impl PlatformConfig { pub fn init(database: PathBuf, cache: cache::Cache) -> Result<(), Error> { //println!("Initializing with {:?} and {:?}", database, cache); let mut config = PLATFORM_CONFIG.write().unwrap(); *config = Some(PlatformConfig { database: Some(Arc::new(database)), cache: Arc::new(cache), }); Ok(()) } pub fn get() -> Result<PlatformConfig, Error> { let config = PLATFORM_CONFIG.read().unwrap(); //println!("Getting platform config: {:?}", config); config .as_ref() .ok_or(Error::expected("initialization", None::<Item>)) .map(|c| c.clone()) } } /// If we call this function it's because kcats is running as a binary /// and we can figure out storage locations without outside input. 
pub fn configure_platform() { //println!("Configure platform"); let project_dirs = ProjectDirs::from("org", "skyrod", "kcats").unwrap(); let project_dir = project_dirs.data_dir(); std::fs::create_dir_all(project_dir).unwrap(); let db_file = project_dir.join("kcats-database.db"); let cache_dir = ProjectDirs::from("org", "skyrod", "kcats") .map(|proj_dirs| proj_dirs.data_dir().join("cache")) .unwrap_or_else(|| Path::new(".").join("cache")); PlatformConfig::init(db_file, cache::Cache::new(cache_dir).unwrap()).unwrap(); } } #[cfg(test)] mod tests { //! Unit tests, in the form of all the examples of usage of the //! different lexicon words. Examples are all in the form of two //! programs that should be equivalent, something like `2 3 +` and //! `5`. Runs both programs in separate environments, compares the //! resulting stack to ensure they are equal. // Note this useful idiom: importing names from outer (for mod tests) scope. //use super::error::Error; //use super::*; use crate::axiom; use crate::list; use crate::serialize::Emit; use crate::traits::*; use crate::types::container as coll; use crate::types::container::Ordered; use crate::types::container::{environment::Environment, error::Error}; use crate::types::{Item, Word}; use test_case::test_case; pub fn get_item(i: &coll::List, index: usize) -> Option<Item> { i.get(index).cloned() } #[tokio::main] async fn test_example( mut prog_env: Environment, program: coll::List, expected: coll::List, description: Option<String>, ) -> Option<Error> { let mut exp_env = prog_env.clone(); prog_env.program.prepend(program.clone()); exp_env.program.prepend(expected.clone()); let p_fut = tokio::spawn(async move { axiom::eval(prog_env).await }); let exp_fut = tokio::spawn(async move { axiom::eval(exp_env).await }); let (prog_env, exp_env) = tokio::join!(p_fut, exp_fut); let prog_env = prog_env.unwrap(); let exp_env = exp_env.unwrap(); if prog_env.stack == exp_env.stack { if let Some(description) = description { println!("PASSED: 
'{}'", description); } else { println!( "PASSED: expected {} got {}", (exp_env.stack.iter().emit()), (prog_env.stack.iter().emit()) ); } None } else { println!( "\nFAILED: '{}'\nEXPECTED: {}\nACTUAL: {}\n", description.unwrap_or_default(), (exp_env.stack.iter().emit()), (prog_env.stack.iter().emit()) ); // println!( // "Debug: expected {:?} got {:?}", // exp_env.stack, prog_env.stack // ); Some(Error::test_assertion(program, expected, prog_env.stack)) } } fn test_word(standard_env: Environment, w: Word) -> Vec<Error> { if let Some(d) = standard_env.dictionary.words.get(&w.clone().fit()) { d.examples .clone() .unwrap() .iter() .filter_map(|ex| { let l = coll::List::try_derive(ex.clone()).unwrap(); let p = coll::List::try_derive(get_item(&l, 0).unwrap()); let exp = coll::List::try_derive(get_item(&l, 1).unwrap()); let description = get_item(&l, 2).and_then(|i| String::try_derive(i).ok()); match (p, exp) { (Ok(p), Ok(exp)) => test_example(standard_env.clone(), p, exp, description), (Err(e), _) => Some(e), (_, Err(e)) => Some(e), } }) .collect::<Vec<Error>>() } else { vec![Error::create( list!("dictionary", list!(w.clone()), "lookup"), "word is not defined", None::<Item>, )] } } #[test_case("+" ; "plus")] #[test_case("-" ; "minus")] #[test_case("=" ; "eq")] #[test_case(">" ; "gt")] #[test_case(">=" ; "gte")] #[test_case("<" ; "lt")] #[test_case("<=" ; "lte")] #[test_case("*" ; "mult")] #[test_case("/" ; "divide")] #[test_case("abs")] #[test_case("addmethod")] #[test_case("and")] #[test_case("any?" ; "is_any")] #[test_case("assemble")] #[test_case("assign")] #[test_case("association")] #[test_case("association?" ; "is_association")] #[test_case("bail")] #[test_case("bits")] #[test_case("both?" ; "is_both")] #[test_case("branch")] #[test_case("butlast")] #[test_case("bytes?" 
; "is_bytes")] #[test_case("catcher")] #[test_case("ceiling")] #[test_case("clone")] #[test_case("clonedown")] #[test_case("clonedeep")] #[test_case("collect")] #[test_case("compare")] #[test_case("contains?" ; "contains")] #[test_case("count")] #[test_case("cram")] #[test_case("cut")] #[test_case("dec")] #[test_case("decide")] #[test_case("decodejson")] #[test_case("decorate")] #[test_case("decorated")] #[test_case("definition")] #[test_case("dip")] #[test_case("dipdown")] #[test_case("dipdeep")] #[test_case("dive")] #[test_case("divedown")] #[test_case("divedeep")] //#[test_case("draft")] #[test_case("drop")] #[test_case("dropdown")] #[test_case("dropdeep")] #[test_case("dropper")] #[test_case("each")] #[test_case("emit")] #[test_case("ends?" ; "is_ends")] #[test_case("encode")] #[test_case("encodejson")] #[test_case("encodestring")] #[test_case("encodenumber")] #[test_case("environment")] #[test_case("environment?" ; "is_environment")] #[test_case("evaluate")] #[test_case("eval-step")] #[test_case("even?" ; "is_even")] #[test_case("evert")] #[test_case("every?" ; "is_every")] #[test_case("execute")] #[test_case("exp")] #[test_case("filter")] #[test_case("finished?" ; "is_finished")] #[test_case("first")] #[test_case("flatten")] #[test_case("flip")] #[test_case("float")] #[test_case("floor")] #[test_case("fold")] #[test_case("format")] #[test_case("frequencies")] #[test_case("future")] #[test_case("get")] #[test_case("generate")] #[test_case("group")] #[test_case("hashbytes")] #[test_case("if")] #[test_case("inc")] #[test_case("indexed")] #[test_case("indexer")] #[test_case("indexof")] #[test_case("inject")] //#[test_case("inscribe")] #[test_case("intersection")] #[test_case("into")] #[test_case("join")] #[test_case("joiner")] #[test_case("keep")] #[test_case("label")] #[test_case("let")] #[test_case("list?" 
; "is_list")] #[test_case("lookup")] #[test_case("loop")] #[test_case("log")] #[test_case("map")] #[test_case("max")] #[test_case("min")] #[test_case("mod")] //#[test_case("module")] #[test_case("not")] #[test_case("empty?" ; "is_empty")] #[test_case("number")] #[test_case("number?" ; "is_number")] #[test_case("odd?" ; "is_odd")] #[test_case("or")] #[test_case("over")] #[test_case("pack")] #[test_case("pad")] #[test_case("pair?" ; "is_pair")] #[test_case("pairwise")] #[test_case("partition")] #[test_case("pipe?" ; "is_pipe")] #[test_case("put")] #[test_case("prepend")] #[test_case("primrec")] #[test_case("radix")] #[test_case("range")] #[test_case("recover")] #[test_case("recur")] #[test_case("repeat")] #[test_case("rest")] #[test_case("restore")] #[test_case("retry")] #[test_case("reverse")] #[test_case("round")] #[test_case("set")] #[test_case("set?" ; "is_set")] #[test_case("shield")] #[test_case("shielddown")] #[test_case("shielddeep")] #[test_case("sink")] #[test_case("siphon")] #[test_case("skipper")] #[test_case("slice")] #[test_case("snapshot")] #[test_case("something?" ; "is_something")] #[test_case("sqrt")] #[test_case("starts?" ; "is_starts")] #[test_case("step")] #[test_case("string")] #[test_case("string?" ; "is_string")] #[test_case("spawn")] #[test_case("split")] #[test_case("swap")] #[test_case("swapdown")] #[test_case("take")] #[test_case("taker")] #[test_case("times")] #[test_case("type")] #[test_case("unassign")] #[test_case("under")] #[test_case("until")] #[test_case("unwrap")] #[test_case("update")] #[test_case("value")] #[test_case("walk")] #[test_case("when")] #[test_case("while")] #[test_case("within?" ; "is_within")] #[test_case("word")] #[test_case("word?" ; "is_word")] #[test_case("wrap")] #[test_case("xor")] #[test_case("zero?" ; "is_zero")] #[test_case("zip")] fn test_lexicon(word: &str) { crate::default::configure_platform(); let e = Environment::default(); let r = test_word(e.clone(), word.fit()); assert!(r.is_empty(), "{:?}", r); } }
1.5.7. Top level execution
We'll define the main module, which reads input for the kcats interpreter process and prints output.
We'll also define how to run unit tests.
//! The main kcats module, that executes the kcats interpreter. See [main] //mod default; use kcats::axiom; use kcats::config::configure_platform; use kcats::serialize::{self, Emit}; pub use kcats::traits::*; use kcats::types::container::environment::Environment; use kcats::types::container::error::Error; use std::io::{self, BufRead, Read, Write}; fn print_result(env: Environment) { if env.program.is_empty() { println!( "{}", serialize::auto_format(env.stack.iter().emit().as_str(), 20, 80) ); } else { println!( "stack: {}\nprogram: {}", serialize::auto_format(env.stack.iter().emit().as_str(), 20, 80), serialize::auto_format(env.program.iter().emit().as_str(), 20, 80) ) } } fn get_stdin() -> String { let mut buf = String::new(); for line in io::stdin().lock().lines() { buf.push_str(&line.unwrap()); buf.push('\n'); } buf } /// Evaluates the program in the context of the env, and handles any /// unhandled errors. Good for interactive programming. async fn repl_eval(mut env: Environment, mut program: String) -> Result<Environment, Error> { // to ensure errors are handled by the repl- so that the // user can continue with more input. program.push_str(" handle"); match serialize::parse_input(&mut env, program) { Ok(_) => Ok(axiom::eval(env).await), Err(e) => Err(e), } } // A function that takes a handle to stdin. It reads a length from // stdin, then reads that many bytes and returns a string. 
async fn read_input() -> Option<String> { //spawn a thread to read from stdin //println!("Reading input"); tokio::spawn(async move { let mut stdin = io::stdin().lock(); let mut buf = String::new(); if let Err(e) = stdin.read_line(&mut buf) { println!("Error reading content length {:?}", e); return None; } // parse an integer from buf let read_len = buf.trim(); //println!("Read length {}", read_len); let len = read_len.parse::<usize>().unwrap_or_default(); if len == 0 { return None; } // read len bytes from stdin let mut buf = vec![0; len]; stdin.read_exact(&mut buf).unwrap(); // convert the bytes to a string Some(String::from_utf8(buf).unwrap()) }) .await .unwrap() } async fn print_with_length(env: &Environment) { let result = serialize::auto_format(env.stack.iter().emit().as_str(), 20, 80); // first print the length of the result println!("{}\n{}", result.len(), result); } async fn print(env: &Environment) { let result = serialize::auto_format(env.stack.iter().emit().as_str(), 20, 80); println!("{}", result); } //It converts the bytes to a // string, and then evaluates that string as a kcats program. It then // prints the length of the result, and then the result itself. 
async fn interactive_mode() { let mut env = Environment::default(); loop { if let Some(program) = read_input().await { env = repl_eval(env, program).await.unwrap(); print_with_length(&env).await; } } } async fn repl() { let mut env = Environment::default(); loop { // Print the prompt and flush it to stdout immediately print!("kcats> "); io::stdout().flush().unwrap(); // Read a line from stdin let mut line = String::new(); io::stdin().read_line(&mut line).unwrap(); // Check if the input is empty, if so, continue to the next loop iteration if line.trim().is_empty() { continue; } env = repl_eval(env, line).await.unwrap(); print(&env).await; } } async fn read_eval_print(program: String) { let mut env = Environment::default(); match serialize::parse_input(&mut env, program) { Ok(_) => { print_result(axiom::eval(env).await); } Err(e) => { println!("Error parsing input: {:?}", e); } } } /// The main intepreter entry function that can start the interpreter /// in several different modes. #[tokio::main] async fn main() { // Set up process-wide paths and panic if this fails configure_platform(); // read command line options, to look for -i switch let args: Vec<String> = std::env::args().collect(); // if args contains "-i", read via handle_stdin if args.contains(&"-i".to_string()) { interactive_mode().await; } else if args.contains(&"-r".to_string()) { repl().await; } else if args.contains(&"-f".to_string()) { let filename = args.get(2).unwrap(); let mut file = std::fs::File::open(filename).unwrap(); let mut buf = String::new(); file.read_to_string(&mut buf).unwrap(); read_eval_print(buf).await; } else if args.contains(&"-p".to_string()) { let program = args.get(2).unwrap(); read_eval_print(program.clone()).await; } else { // otherwise, read from stdin read_eval_print(get_stdin()).await; } } // if let (Item::List(program), Item::List(expected)) = (program, expected) { // } else { // Err(Error::from("Example should be a pair")) // } // for ex in d.examples().iter() { // let e 
= List::try_derive(*ex).ok().unwrap(); // let p = List::try_derive(*e.get(0).unwrap()).ok().unwrap(); // let exp = List::try_derive(*e.get(1).unwrap()).ok().unwrap(); // test_example(axiom::standard_env.clone(), w, p,exp) // }.retain(|i| i.is_some()).collect::<Vec<Error>>()
1.5.8. Pipes (input/output)
Kcats confines all I/O to pipes: you put values into a pipe and they emerge elsewhere. Words that act on pipes are the only ones that can be impure; everything else is a value.
- Basic Types
The basic pipe contracts.
use super::error::Error; use crate::traits::*; use crate::types::container::{self as coll, SimpleTake}; use crate::types::{self, Item}; use std::pin::Pin; use std::sync::Arc; use tokio::sync::RwLock; use futures::{executor, future}; pub mod channel; #[cfg(feature = "database")] pub mod db; pub mod fs; pub mod net; pub mod standard; pub mod time; pub trait FutureTake { type Item; fn take_future<'a>( &'a mut self, ) -> Pin<Box<dyn std::future::Future<Output = Result<Option<Self::Item>, Error>> + Send + 'a>>; } /// A pipe that accepts items. #[derive(Debug, Clone)] pub enum In { /// A pipe that takes bytes to write to a file on disk StaticFile(Arc<RwLock<fs::StaticFile>>), /// A pipe that takes bytes to write to a TCP/IP socket Socket(Arc<RwLock<net::Socket>>), /// A pipe that takes items to send through a channel to another /// part of the running program Handoff(channel::Handoff<Item>), /// A pipe that takes bytes to write to standard out Standard, } impl PartialEq for In { fn eq(&self, other: &Self) -> bool { match (self, other) { (In::StaticFile(s1), In::StaticFile(s2)) => Arc::ptr_eq(s1, s2), (In::Socket(s1), In::Socket(s2)) => Arc::ptr_eq(s1, s2), (In::Handoff(h1), In::Handoff(h2)) => h1 == h2, _ => false, } } } /// A pipe that produces items. #[derive(Debug, Clone)] pub enum Out { /// A pipe that produces bytes from a file on disk StaticFile(Arc<RwLock<fs::StaticFile>>), /// A pipe that produces bytes from a TCP/IP socket Socket(Arc<RwLock<net::Socket>>), /// A pipe that produces sockets from a TCP/IP server socket ServerSocket(Arc<RwLock<net::ServerSocket>>), /// A pipe that produces items from a channel that comes from /// another part of the program Handoff(channel::Handoff<Item>), /// A pipe that produces a dummy value after a given amount of /// time. Can be used as a timeout mechanism when waiting on /// multiple pipes at once. 
Timer(channel::Timer), /// A pipe that produces timestamps of the current UNIX time Time, /// A pipe that produces bytes from standard in. Standard, } impl PartialEq for Out { fn eq(&self, other: &Self) -> bool { match (self, other) { (Out::StaticFile(s1), Out::StaticFile(s2)) => Arc::ptr_eq(s1, s2), (Out::Socket(s1), Out::Socket(s2)) => Arc::ptr_eq(s1, s2), (Out::ServerSocket(s1), Out::ServerSocket(s2)) => Arc::ptr_eq(s1, s2), (Out::Handoff(h1), Out::Handoff(h2)) => h1 == h2, (Out::Time, Out::Time) => true, (Out::Standard, Out::Standard) => true, _ => false, } } } /// A bi-directional pipe that can accept and produce Items. #[derive(Debug, Clone)] pub enum Tunnel { /// A pipe that can both produce and accept bytes to read/write from a file. StaticFile(Arc<RwLock<fs::StaticFile>>), /// A pipe that can both produce and accept bytes to read/write /// from a TCP/IP socket. Socket(Arc<RwLock<net::Socket>>), /// A pipe that produces or accepts values to/from a channel that /// connects to another part of the program. Handoff(channel::Handoff<Item>), /// A pipe to standard in/out that produces/accepts bytes. Standard, } impl PartialEq for Tunnel { fn eq(&self, other: &Self) -> bool { match (self, other) { (Tunnel::StaticFile(s1), Tunnel::StaticFile(s2)) => Arc::ptr_eq(s1, s2), (Tunnel::Socket(s1), Tunnel::Socket(s2)) => Arc::ptr_eq(s1, s2), (Tunnel::Handoff(h1), Tunnel::Handoff(h2)) => h1 == h2, (Tunnel::Standard, Tunnel::Standard) => true, _ => false, } } } impl Derive<Tunnel> for Out { fn derive(t: Tunnel) -> Self { match t { Tunnel::StaticFile(f) => Out::StaticFile(f), Tunnel::Socket(s) => Out::Socket(s), Tunnel::Handoff(h) => Out::Handoff(h), Tunnel::Standard => Out::Standard, } } } impl Derive<Tunnel> for In { fn derive(t: Tunnel) -> Self { match t { Tunnel::StaticFile(f) => In::StaticFile(f), Tunnel::Socket(s) => In::Socket(s), Tunnel::Handoff(h) => In::Handoff(h), Tunnel::Standard => In::Standard, } } } impl In { /// Puts the [Item] into the pipe. 
Blocks if the pipe is full. pub fn put(&mut self, i: Item) -> types::Future<Result<(), Error>> { match self { In::StaticFile(f) => { let f = f.clone(); Box::pin(async move { f.write().await.put(i).await }) } In::Socket(f) => { let f = f.clone(); Box::pin(async move { f.write().await.put(i).await }) } In::Handoff(ref mut h) => Box::pin(h.put(i)), //_ => Err(Error::expected("foo")), In::Standard => standard::put(i), } } } impl FutureTake for Tunnel { /// Takes an [Item] from the tunnel, blocks if the receive side of /// the tunnel is empty. type Item = Item; /// Takes an [Item] from the pipe, blocks if the pipe is empty. fn take_future<'a>( &'a mut self, ) -> Pin<Box<dyn std::future::Future<Output = Result<Option<Self::Item>, Error>> + Send + 'a>> { match self { Tunnel::StaticFile(f) => Box::pin(async move { f.write() .await .take_future() .await .map(|i| Some(Item::derive(i))) }), Tunnel::Socket(f) => Box::pin(async move { f.write() .await .take_future() .await .map(|i| Some(Item::derive(i))) }), Tunnel::Handoff(h) => Box::pin(h.take_future()), Tunnel::Standard => standard::take_future(), } } } impl Tunnel { /// Puts the [Item] into the tunnel, blocks if the send side of /// the tunnel is full. pub fn put(&mut self, i: Item) -> types::Future<Result<(), Error>> { match self { Tunnel::StaticFile(f) => { let f = f.clone(); Box::pin(async move { f.write().await.put(i).await }) } Tunnel::Socket(f) => { let f = f.clone(); Box::pin(async move { f.write().await.put(i).await }) } Tunnel::Handoff(ref mut h) => Box::pin(h.put(i)), Tunnel::Standard => standard::put(i), } } } impl FutureTake for Out { type Item = Item; /// Takes an [Item] from the pipe, blocks if the pipe is empty. 
fn take_future<'a>( &'a mut self, ) -> Pin<Box<dyn std::future::Future<Output = Result<Option<Self::Item>, Error>> + Send + 'a>> { match self { Out::StaticFile(f) => { Box::pin( async move { f.write().await.take_future().await.map(|bs| Some(bs.fit())) }, ) } Out::Socket(f) => { Box::pin( async move { f.write().await.take_future().await.map(|bs| Some(bs.fit())) }, ) } Out::ServerSocket(f) => { Box::pin(async move { f.write().await.take_future().await.map(|s| Some(s.fit())) }) } Out::Handoff(h) => Box::pin(h.take_future()), Out::Timer(ref mut t) => Box::pin(t.take_future()), Out::Time => Box::pin(future::ready(Ok(time::Time::new() .take_simple() .map(Item::derive)))), Out::Standard => Box::pin(standard::take_future()), } } } impl crate::serialize::Display for In { fn representation(&self) -> Item { match self { In::StaticFile(f) => executor::block_on(async move { f.read().await.representation() }), In::Socket(f) => executor::block_on(async move { f.read().await.representation() }), In::Handoff(h) => h.representation(), In::Standard => standard::representation(), } } } impl crate::serialize::Display for Out { fn representation(&self) -> Item { match self { Out::StaticFile(f) => { executor::block_on(async move { f.read().await.representation() }) } Out::Socket(f) => executor::block_on(async move { f.read().await.representation() }), Out::ServerSocket(f) => { executor::block_on(async move { f.read().await.representation() }) } Out::Handoff(h) => h.representation(), Out::Timer(t) => t.representation(), Out::Time => time::representation(), Out::Standard => standard::representation(), } } } impl crate::serialize::Display for Tunnel { fn representation(&self) -> Item { match self { Tunnel::StaticFile(f) => { executor::block_on(async move { f.read().await.representation() }) } Tunnel::Socket(f) => executor::block_on(async move { f.read().await.representation() }), Tunnel::Handoff(h) => h.representation(), Tunnel::Standard => standard::representation(), } } } /* Pipes can be 
"closed", from either end to signal that either the * putter or taker has gone away. Sometimes the type of pipe * may not really support this concept but an implementation is * required. For example, files. When you open a file for writing and * then "close" it, that doesn't really do anything. Rust doesn't have * an explicit file close. You have to drop the reference to it, which * in kcats you can do by popping the pipe off the stack. Rust will * clean up automatically, other impls might have to reference count. * * The contract here is as follows: * 1. After calling close, put on the pipe returns an error * * 2. After calling close, take on the pipe will return still-buffered * items (if the pipe has a buffer), but once buffer is exhausted it * will return error. * * 2. Errors cannot be put into a pipe (the taker can't distinguish * between io error and an error value). To work around this, wrap the * error value in a list to quote it. Putting error into a pipe will * return an io error. * * 3. Once closed pipes cannot be ever be put into again. closed? will always * return true thereafter. * * One use case that has to be handled specially is a file we've fully * read but later someone else might write more bytes to the end. Does * the pipe close when we reach EOF? I think we might need to support * both types (a type that closes when hitting eof and one that * doesn't). The former is the "normal" use case, which will be the * default. * * These two types are basically static vs dynamic content. Either all * the content is known now, or it isn't. 
* */ fn closed_error(on_take: bool) -> Error { Error::create( coll::List::derive_iter([ Item::derive("close"), if on_take { "take" } else { "put" }.fit(), ]), "attempt to use closed pipe", Option::<Item>::None, ) } impl Derive<Tunnel> for Item { fn derive(t: Tunnel) -> Self { Item::Dispenser(coll::Dispenser::Tunnel(t)) } } impl Derive<Out> for Item { fn derive(t: Out) -> Self { Item::Dispenser(coll::Dispenser::Out(t)) } } impl Derive<In> for Item { fn derive(t: In) -> Self { Item::Receptacle(coll::Receptacle::In(t)) } } impl TryDerive<Item> for In { type Error = Error; fn try_derive(i: Item) -> Result<Self, Self::Error> { match i { Item::Receptacle(coll::Receptacle::In(i)) => Ok(i), Item::Receptacle(coll::Receptacle::Tunnel(t)) => Ok(t.fit()), Item::Dispenser(coll::Dispenser::Tunnel(t)) => Ok(t.fit()), i => Err(Error::expected("pipe", i)), } } } impl TryDerive<Item> for Out { type Error = Error; fn try_derive(i: Item) -> Result<Self, Self::Error> { match i { Item::Dispenser(coll::Dispenser::Out(o)) => Ok(o), Item::Dispenser(coll::Dispenser::Tunnel(t)) => Ok(t.fit()), Item::Receptacle(coll::Receptacle::Tunnel(t)) => Ok(t.fit()), i => Err(Error::expected("pipe", i)), } } }
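The close contract above (buffered items survive a close; only a drained, closed pipe errors) can be illustrated without the kcats types at all. This is a dependency-free sketch using a std `sync_channel` as a stand-in for a buffered pipe, where dropping the sender plays the role of closing the put side; it is an analogy, not the Handoff implementation:

```rust
use std::sync::mpsc;

fn main() {
    // A buffered channel standing in for a kcats pipe with a buffer of 2.
    let (tx, rx) = mpsc::sync_channel::<i32>(2);
    tx.send(1).unwrap();
    tx.send(2).unwrap();

    // "Closing" the put side is dropping the sender.
    drop(tx);

    // Contract point 2: takes still drain the buffer after close...
    assert_eq!(rx.recv().unwrap(), 1);
    assert_eq!(rx.recv().unwrap(), 2);

    // ...and only once the buffer is exhausted does a take report closed.
    assert!(rx.recv().is_err());
    println!("ok");
}
```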
- Files
How to interact with files on disk.
use crate::axiom::ItemResult; use crate::traits::*; use crate::types::container::associative as assoc; use crate::types::container::error::Error; use crate::types::*; use std::future; use std::pin::Pin; use std::ptr; use std::sync::Arc; use tokio::fs::File; use tokio::io::{AsyncReadExt, AsyncWriteExt}; use tokio::sync::RwLock; use super::{closed_error, FutureTake}; #[derive(Debug)] pub struct StaticFile { pub file: Option<File>, pub path: String, } impl PartialEq for StaticFile { fn eq(&self, other: &Self) -> bool { // Check if the 'file' fields of both structs are the same by reference ptr::eq(&self.file, &other.file) } } impl FutureTake for StaticFile { type Item = Bytes; fn take_future<'a>( &'a mut self, ) -> Pin<Box<dyn std::future::Future<Output = Result<Option<Self::Item>, Error>> + Send + 'a>> { match self.file.as_mut() { Some(f) => { let mut bs = [0u8; 102400]; Box::pin(async move { let ct = f.read(&mut bs).await?; if ct == 0 { // EOF, no more takes since it's static Ok(None) } else { Ok(Some(bs[0..ct].to_vec().fit())) } }) } None => Box::pin(future::ready(Err(closed_error(false)))), } } } impl StaticFile { pub fn put<'a>( &'a mut self, i: Item, ) -> Pin<Box<dyn std::future::Future<Output = Result<(), Error>> + Send + 'a>> { match self.file.as_mut() { Some(f) => { let b = Bytes::try_derive(i); match b { Ok(bs) => Box::pin(async move { f.write_all(&bs).await.map_err(|e| e.into()) }), Err(e) => Box::pin(future::ready(Err(e))), } } None => Box::pin(future::ready(Err(closed_error(false)))), } } } impl crate::serialize::Display for StaticFile { fn representation(&self) -> Item { assoc::Association::derive_iter([ ("type".fit(), "tunnel".fit()), ( "values".fit(), assoc::Association::derive_iter([("type".fit(), "bytes".fit())]).fit(), ), ( "to".fit(), assoc::Association::derive_iter([("file".fit(), self.path.clone().fit())]).fit(), ), ]) .fit() } } pub fn file_in(i: Item) -> ItemResult { let path = String::try_derive(i)?; let file = std::fs::File::options() 
.read(true) .write(true) .create_new(true) .open(path.clone())?; Ok(super::In::StaticFile(Arc::new(RwLock::new(StaticFile { file: Some(File::from_std(file)), path, }))) .fit()) } pub fn file_out(i: Item) -> ItemResult { let path = String::try_derive(i)?; let file = std::fs::File::open(path.clone())?; Ok(super::Out::StaticFile(Arc::new(RwLock::new(StaticFile { file: Some(File::from_std(file)), path, }))) .fit()) } impl Derive<StaticFile> for Item { fn derive(f: StaticFile) -> Self { super::Out::StaticFile(Arc::new(RwLock::new(f))).fit() } }
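Note that file_in opens with create_new(true), which errors if the path already exists rather than truncating it. A minimal std-only sketch of that open mode (the helper name open_fresh is mine, not part of the codebase):

```rust
use std::fs;
use std::io;
use std::path::Path;

// Mirror of file_in's open options: read+write, but refuse to
// clobber an existing file (create_new).
fn open_fresh(path: &Path) -> io::Result<fs::File> {
    fs::File::options()
        .read(true)
        .write(true)
        .create_new(true)
        .open(path)
}

fn main() {
    let path = std::env::temp_dir().join("kcats_create_new_demo");
    let _ = fs::remove_file(&path); // start clean

    // File absent: the open succeeds and creates it.
    assert!(open_fresh(&path).is_ok());

    // File present: create_new refuses to open it again.
    assert!(open_fresh(&path).is_err());

    let _ = fs::remove_file(&path);
    println!("ok");
}
```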
- Network
How to interact with the network (TCP/IP sockets).
use crate::axiom::ItemResult; use crate::list; use crate::traits::*; use crate::types::container as cont; use crate::types::container::associative as assoc; use crate::types::container::error::Error; use crate::types::container::pipe::{self, FutureTake}; use crate::types::number::Int; use crate::types::{self, Bytes, Item}; use futures::future::FutureExt; use std::future::{self}; use std::net::{Ipv4Addr, SocketAddrV4}; use std::pin::Pin; use std::ptr; use std::str::FromStr; use std::sync::Arc; use tokio::io::{AsyncReadExt, AsyncWriteExt}; use tokio::net::{TcpListener, TcpStream}; use tokio::sync::RwLock; #[derive(Debug)] pub struct Socket { pub socket: TcpStream, pub addr: (String, u16), } impl PartialEq for Socket { fn eq(&self, other: &Self) -> bool { // Check if the 'socket' fields of both structs are the same by reference ptr::eq(&self.socket, &other.socket) } } impl FutureTake for Socket { type Item = Bytes; fn take_future<'a>( &'a mut self, ) -> Pin<Box<dyn std::future::Future<Output = Result<Option<Self::Item>, Error>> + Send + 'a>> { let mut bs = [0u8; 1024]; Box::pin(async move { let n = self.socket.read(&mut bs).await?; if n == 0 { Ok(None) } else { Ok(Some(bs[..n].to_vec())) } }) } } impl Socket { pub fn put<'a>( &'a mut self, i: Item, ) -> Pin<Box<dyn std::future::Future<Output = Result<(), Error>> + Send + 'a>> { //println!("Putting {:?}", i); let b = types::Bytes::try_derive(i); match b { Ok(bs) => { Box::pin(async move { self.socket.write_all(&bs).await.map_err(|e| e.into()) }) } Err(e) => Box::pin(future::ready(Err(e))), } } } impl crate::serialize::Display for Socket { fn representation(&self) -> Item { assoc::Association::derive_iter([ ("type".fit(), "tunnel".fit()), ("realm".fit(), "tcp".fit()), ("address".fit(), self.addr.0.to_string().fit()), ("port".fit(), self.addr.1.to_string().fit()), ]) .fit() } } // Server sockets #[derive(Debug)] pub struct ServerSocket { pub socket: TcpListener, } impl PartialEq for ServerSocket { fn eq(&self, other: 
&Self) -> bool { // Check if the 'socket' fields of both structs are the same by reference ptr::eq(&self.socket, &other.socket) } } impl FutureTake for ServerSocket { type Item = Socket; fn take_future<'a>( &'a mut self, ) -> Pin<Box<dyn std::future::Future<Output = Result<Option<Self::Item>, Error>> + Send + 'a>> { Box::pin(async move { let (socket, addr) = self.socket.accept().await?; Ok(Some(Socket { socket, addr: (addr.ip().to_string(), addr.port()), })) }) } } impl crate::serialize::Display for ServerSocket { fn representation(&self) -> Item { assoc::Association::derive_iter([ ("type".fit(), "pipe".fit()), ( "serversocket".fit(), "todo: fix serversocket local addr async issue".fit(), //Item::String(self.socket.lock().await.local_addr().unwrap().to_string()), ), ]) .fit() } } fn socket_addr(i: Item, j: Item) -> Result<SocketAddrV4, Error> { //println!("socket: {:?} {:?}", i, j); let addr = Ipv4Addr::from_str(String::try_derive(j)?.as_str())?; let port = Int::try_derive(i)? as u16; Ok(SocketAddrV4::new(addr, port)) } fn host_addr(i: Item, j: Item) -> Result<(String, u16), Error> { //println!("socket: {:?} {:?}", i, j); let addr = String::try_derive(j)?; let port = Int::try_derive(i)? 
as u16; Ok((addr, port)) } pub fn server_socket(i: Item, j: Item) -> types::Future<ItemResult> { match socket_addr(i, j) { Ok(addr) => { Box::pin(TcpListener::bind(addr).map(|l| { Ok(super::Out::ServerSocket(Arc::new(RwLock::new(ServerSocket { socket: l.unwrap(), }))) .fit()) })) } Err(e) => Box::pin(future::ready(Err(e))), } } pub fn socket(i: Item, j: Item) -> types::Future<ItemResult> { match host_addr(i, j) { Ok(addr) => Box::pin(TcpStream::connect(addr.clone()).map(move |s| { Ok(super::Tunnel::Socket(Arc::new(RwLock::new(Socket { socket: s.unwrap(), addr, }))) .fit()) })), Err(e) => Box::pin(future::ready(Err(e))), } } // pub fn server_socket(env: Environment) -> environment::Future { // let addr = env.pop(); // let inner_env = Environment::try_derive(tos); // match inner_env { // Ok(inner) => Box::pin(eval_step(inner).map(|inner_next| env.push(Item::Env(inner_next)))), // Err(e) => env.push(Item::Error(e)).fit(), // } // } impl From<std::net::AddrParseError> for Error { fn from(err: std::net::AddrParseError) -> Error { Error::create(list!("addrparse"), &err.to_string(), Option::<Item>::None) } } impl Derive<Socket> for Item { fn derive(ss: Socket) -> Item { Item::Dispenser(cont::Dispenser::Tunnel(pipe::Tunnel::Socket(Arc::new( RwLock::new(ss), )))) } }
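The accept/connect/read/write cycle above can be sketched in blocking form with std::net, standing in for the tokio ServerSocket/Socket pair. Binding to port 0 asks the OS for any free port, as a hedged analog of server_socket; the helper name echo_roundtrip is mine:

```rust
use std::io::{Read, Write};
use std::net::{TcpListener, TcpStream};
use std::thread;

// Blocking analog of the async Socket/ServerSocket pair: the server
// accepts one connection and echoes one read back to the client.
fn echo_roundtrip(msg: &[u8]) -> Vec<u8> {
    // Port 0: let the OS pick a free port, like server_socket's bind.
    let listener = TcpListener::bind("127.0.0.1:0").unwrap();
    let addr = listener.local_addr().unwrap();

    let server = thread::spawn(move || {
        let (mut sock, _) = listener.accept().unwrap();
        let mut buf = [0u8; 1024];
        let n = sock.read(&mut buf).unwrap();
        sock.write_all(&buf[..n]).unwrap(); // echo the bytes back
    });

    let mut client = TcpStream::connect(addr).unwrap();
    client.write_all(msg).unwrap();
    let mut out = vec![0u8; msg.len()];
    client.read_exact(&mut out).unwrap();
    server.join().unwrap();
    out
}

fn main() {
    assert_eq!(echo_roundtrip(b"hello"), b"hello");
    println!("ok");
}
```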
- Time
use crate::types::container::{associative as assoc, SimpleTake}; use crate::types::number::Int; use crate::types::*; use std::time::{SystemTime, UNIX_EPOCH}; pub struct Time; impl Time { pub fn new() -> Self { Time } } impl SimpleTake for Time { type Item = Int; fn take_simple(&mut self) -> Option<Self::Item> { let t = SystemTime::now() .duration_since(UNIX_EPOCH) .unwrap() .as_millis() as Int; Some(t) } } pub fn representation() -> Item { assoc::Association::derive_iter([ ("type".fit(), "out".fit()), ("from".fit(), "systemtime".fit()), ( "values".fit(), assoc::Association::derive_iter([ ("type".fit(), "integer".fit()), ("units".fit(), "milliseconds".fit()), ]) .fit(), ), ]) .fit() }
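Each take from the Time pipe is simply "now" as UNIX milliseconds; a self-contained version of that take (the function name take_time_millis is mine):

```rust
use std::time::{SystemTime, UNIX_EPOCH};

// One take from the Time pipe: the current UNIX time in milliseconds.
fn take_time_millis() -> i64 {
    SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .expect("system clock before 1970")
        .as_millis() as i64
}

fn main() {
    let a = take_time_millis();
    let b = take_time_millis();
    // Timestamps are positive and non-decreasing across takes.
    assert!(a > 0);
    assert!(b >= a);
    println!("ok");
}
```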
- Standard in/out
use crate::types::container::associative as assoc; use crate::types::container::error::Error; use crate::types::{self, *}; use std::future; use std::io::{self, Read, Write}; pub fn take_future<'a>( ) -> Pin<Box<dyn std::future::Future<Output = Result<Option<Item>, Error>> + Send + 'a>> { let mut buf = [0u8]; let n = io::stdin().read(&mut buf); let f = match n { Ok(0) => Ok(None), Ok(n) => Ok(Some(buf[..n].to_vec().fit())), Err(e) => Err(e.into()), }; Box::pin(future::ready(f)) } pub fn put(i: Item) -> types::Future<Result<(), Error>> { let bs = Bytes::try_derive(i); match bs { Ok(bs) => { let f = io::stdout().write(&bs); Box::pin(future::ready(f.map_err(|e| e.into()).map(|_| ()))) } Err(e) => Box::pin(future::ready(Err(e))), } } pub fn representation() -> Item { assoc::Association::derive_iter([ ("type".fit(), "tunnel".fit()), ("peer".fit(), "standard".fit()), ]) .fit() }
- Channels
Implement the handoff type.
use crate::axiom; use crate::traits::*; use crate::types::container as coll; use crate::types::container::error::Error; use crate::types::container::pipe::FutureTake; use crate::types::container::{associative as assoc, environment::Environment, error, pipe}; use crate::types::number::Int; use crate::types::{self, Item}; use flume; use std::future; use std::pin::Pin; use std::ptr; use std::sync::atomic::{AtomicUsize, Ordering}; use std::sync::Arc; use tokio::task::JoinHandle; use tokio::time::{sleep, Duration}; #[derive(Debug, Clone)] // Use Option because we want to be able to drop senders/receivers to // close the channel pub struct Handoff<T> { pub receiver: Option<flume::Receiver<T>>, pub sender: Option<flume::Sender<T>>, pub bidirectional: bool, pub id: usize, } impl<T> PartialEq for Handoff<T> { fn eq(&self, other: &Self) -> bool { match (&self.receiver, &other.receiver, &self.sender, &other.sender) { (Some(sr), Some(or), Some(ss), Some(os)) => ptr::eq(&sr, &or) && ptr::eq(&ss, &os), _ => false, } } } static ID: AtomicUsize = AtomicUsize::new(0); impl FutureTake for Handoff<Item> { type Item = Item; fn take_future<'a>( &'a mut self, ) -> Pin<Box<dyn std::future::Future<Output = Result<Option<Self::Item>, Error>> + Send + 'a>> { // println!( // "Taking from channel: {:?} on {:?}", // self, // thread::current().id() // ); if !self.bidirectional { self.close_put(); } if let Some(ch) = self.receiver.clone() { Box::pin(async move { ch.recv_async().await.map(Some).or_else(|_| Ok(None)) }) } else { Box::pin(future::ready(Ok(None))) } } } impl Handoff<Item> { pub fn new(bidirectional: bool) -> Handoff<Item> { let (sender, receiver) = flume::bounded::<Item>(0); let id = ID.fetch_add(1, Ordering::Relaxed); Handoff::<Item> { sender: Some(sender), receiver: Some(receiver), bidirectional, id, } } pub fn put(&mut self, i: Item) -> types::Future<Result<(), error::Error>> { // println!( // "Putting into channel: {} into {:?} on {:?}", // i.clone(), // self, // 
thread::current().id() // ); if !self.bidirectional { self.close_take() }; if let Some(ch) = self.sender.clone() { if axiom::is_truthy(i.clone()) { Box::pin(async move { ch.send_async(i) .await .map_err(|_| pipe::closed_error(false)) }) } else { // If we're putting 'nothing', that indicates end of // input, so we drop the sender. self.close_put(); Box::pin(future::ready(Ok(()))) } } else { Box::pin(future::ready(Err(pipe::closed_error(false)))) } } pub fn close_take(&mut self) { if self.receiver.is_some() { //println!("Dropping receiver"); self.receiver = None; } } pub fn close_put(&mut self) { if self.sender.is_some() { //println!("Dropping sender"); self.sender = None; } } } impl crate::serialize::Display for Handoff<Item> { fn representation(&self) -> Item { let t = match (&self.sender, &self.receiver) { (Some(_), Some(_)) => "tunnel", (Some(_), None) => "in", (None, Some(_)) => "out", (None, None) => "closed", }; assoc::Association::derive_iter([ ("type".fit(), t.fit()), ("handoff".fit(), (self.id as Int).fit()), ]) .fit() } } pub fn handoff(mut env: Environment) -> types::Future<Environment> { env.pop_prog(); env.push(pipe::Tunnel::Handoff(Handoff::new(false))); env.fit() } impl From<flume::RecvError> for error::Error { fn from(_: flume::RecvError) -> Self { pipe::closed_error(false) // todo fix this } } impl From<flume::SendError<Item>> for error::Error { fn from(_: flume::SendError<Item>) -> Self { pipe::closed_error(false) } } enum ChannelOp<T> { Send(Arc<flume::Sender<T>>, T), Receive(Arc<flume::Receiver<T>>), } /// Given a list of pipes (channels) on top of stack, use flume's /// selector to choose the next ready pipe. A pipe means it's a /// receive, a pipe/item pair means it's a send. 
pub fn select(i: Item) -> axiom::ItemResult { let l = coll::List::try_derive(i)?; let original = l.clone(); //Create references out of any [pipe item] pairs let lr = l .iter() .cloned() .map(move |i| match i { Item::Dispenser(coll::Dispenser::Out(pipe::Out::Handoff(p))) => { Ok(ChannelOp::Receive(Arc::new(p.receiver.unwrap()))) } Item::Dispenser(coll::Dispenser::Tunnel(pipe::Tunnel::Handoff(p))) => { Ok(ChannelOp::Receive(Arc::new(p.receiver.unwrap()))) } // Handle timeout channels - start the timer and add receive op Item::Dispenser(coll::Dispenser::Out(pipe::Out::Timer(t))) => { let mut t = t.clone(); t.start(); Ok(ChannelOp::Receive(Arc::new(t.receiver.unwrap()))) } i => { let l = coll::List::try_derive(i.clone())?; let p = l.front(); let i = l.get(1); match (p, i) { (Some(p), Some(i)) => match (p, i) { (Item::Receptacle(coll::Receptacle::In(pipe::In::Handoff(p))), i) => Ok( ChannelOp::Send(Arc::new(p.sender.clone().unwrap()), i.clone()), ), ( Item::Receptacle(coll::Receptacle::Tunnel(pipe::Tunnel::Handoff(p))), i, ) => Ok(ChannelOp::Send( Arc::new(p.sender.clone().unwrap()), i.clone(), )), (p, _i) => Err(error::Error::expected("handoff", p.clone())), }, _ => Err(error::Error::short_list(2)), } } }) .collect::<Result<Vec<ChannelOp<Item>>, error::Error>>()?; let (res, idx) = { let mut selector = flume::Selector::new(); // loop over the operations and add them to the selector. Each one // returns the original index in the list, so we can use that to // fetch the original item from the list. 
for (idx, item) in lr.iter().enumerate() { let idx_clone = idx; match item { ChannelOp::Receive(r) => { selector = selector.recv(&r, move |i| { (i.map(Some).map_err(error::Error::from), idx_clone) }); } ChannelOp::Send(s, i) => { selector = selector.send(&s, i.clone(), move |i| { (i.map(|_| None).map_err(error::Error::from), idx) }); } } } selector.wait() }; let selected = original.get(idx).unwrap().clone(); match res { Ok(Some(i)) => { let l: Item = coll::List::derive_iter(vec![selected, i]).fit(); Ok(l) } Ok(None) => Ok(selected), Err(e) => Err(e), } } impl TryDerive<Item> for Handoff<Item> { type Error = error::Error; fn try_derive(i: Item) -> Result<Self, Self::Error> { match i { Item::Dispenser(coll::Dispenser::Out(pipe::Out::Handoff(p))) => Ok(p), Item::Receptacle(coll::Receptacle::In(pipe::In::Handoff(p))) => Ok(p), Item::Receptacle(coll::Receptacle::Tunnel(pipe::Tunnel::Handoff(p))) => Ok(p), Item::Dispenser(coll::Dispenser::Tunnel(pipe::Tunnel::Handoff(p))) => Ok(p), i => Err(error::Error::expected("handoff", i)), } } } // drop the receiver side of the handoff and return the handoff item pub fn sender(i: Item) -> axiom::ItemResult { let mut h = Handoff::try_derive(i)?; h.close_take(); Ok(Item::Receptacle(coll::Receptacle::In(pipe::In::Handoff(h)))) } // drop the sender side of the handoff and return the handoff item pub fn receiver(i: Item) -> axiom::ItemResult { let mut h = Handoff::try_derive(i)?; h.close_put(); Ok(Item::Dispenser(coll::Dispenser::Out(pipe::Out::Handoff(h)))) } #[derive(Debug)] pub struct Timer { receiver: Option<flume::Receiver<Item>>, handle: Option<JoinHandle<()>>, duration: Duration, } // Cloning a timeout makes a new one, clears state impl Clone for Timer { fn clone(&self) -> Self { Self { receiver: None, handle: None, duration: self.duration, } } } impl FutureTake for Timer { type Item = Item; fn take_future<'a>( &'a mut self, ) -> Pin<Box<dyn std::future::Future<Output = Result<Option<Self::Item>, Error>> + Send + 'a>> { 
self.start(); let receiver = self.receiver.clone().unwrap(); Box::pin(async move { //println!("Receiving"); receiver.recv_async().await.map(Some).or_else(|_| Ok(None)) }) } } impl Timer { fn new(duration: Duration) -> Timer { Timer { receiver: None, handle: None, duration, } } fn start(&mut self) { if self.handle.is_none() { let (sender, receiver) = flume::bounded(1); let duration = self.duration; self.receiver = Some(receiver); self.handle = Some(tokio::spawn(async move { sleep(duration).await; //TODO handle error condition on send let _ = sender.send(Item::default()); })); } } } impl Derive<Timer> for Item { fn derive(t: Timer) -> Self { Item::Dispenser(coll::Dispenser::Out(pipe::Out::Timer(t))) } } impl crate::serialize::Display for Timer { fn representation(&self) -> Item { assoc::Association::derive_iter([ ("type".fit(), "pipe".fit()), ("timeout".fit(), (self.duration.as_millis() as Int).fit()), ]) .fit() } } pub fn timer(i: Item) -> axiom::ItemResult { let ms = Int::try_derive(i)?; //TODO: check for negative values Ok(Timer::new(Duration::from_millis(ms as u64)).fit()) }
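Handoff::new builds on flume::bounded::<Item>(0), a rendezvous channel: a put blocks until a taker arrives. std's sync_channel(0) has the same semantics, so the behavior can be sketched without flume (the helper name rendezvous is mine):

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

// A zero-capacity channel: send blocks until recv is called, which is
// the "handoff" semantics the Handoff pipe is built on.
fn rendezvous(msg: &'static str) -> &'static str {
    let (tx, rx) = mpsc::sync_channel::<&str>(0);

    let taker = thread::spawn(move || {
        // Delay so the send below is definitely waiting on us.
        thread::sleep(Duration::from_millis(50));
        rx.recv().unwrap()
    });

    // Blocks here until the taker calls recv: the rendezvous.
    tx.send(msg).unwrap();
    taker.join().unwrap()
}

fn main() {
    assert_eq!(rendezvous("handoff"), "handoff");
    println!("ok");
}
```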
- Database
use crate::axiom; use crate::config; use crate::list; use crate::traits::*; use crate::types::container::{self as coll, associative as assoc, error::Error}; use crate::types::number::Number; use crate::types::{self, Item}; use rusqlite::types::{ToSql, ToSqlOutput, Value, ValueRef}; use rusqlite::{params, Connection, Error as DBError}; use std::path::PathBuf; use uuid; pub struct Db { conn: Connection, } impl Db { pub fn new() -> Result<Self, DBError> { let db_file = config::PlatformConfig::get() .unwrap() .database .ok_or(DBError::InvalidPath(PathBuf::from("".to_string())))?; let conn = Connection::open(db_file.as_path())?; Ok(Db { conn }) } pub fn query(&self, query: &str, params: Vec<(String, Item)>) -> axiom::ItemResult { let mut stmt = self.conn.prepare(query)?; // Convert Vec<Box<dyn ToSql>> to Vec<&dyn ToSql> let params_refs: Vec<(&str, &dyn ToSql)> = params .iter() .map(|(s, b)| (s.as_str(), b as &dyn ToSql)) .collect(); let rows = stmt.query_and_then(params_refs.as_slice(), |row| { (0..row.as_ref().column_count()) .map(|column_index| { let column_name = row.as_ref().column_name(column_index).unwrap().to_string(); let column_value: ValueRef = row.get_ref_unwrap(column_index); Item::try_derive(column_value) .map(|v| (assoc::KeyItem::derive(column_name.as_str()), v)) }) .my_collect::<Result<assoc::Association, _>>() .map(Item::derive) })?; Ok(rows.my_collect::<Result<coll::List, _>>()?.fit()) } fn insert_attribute(&self, id: uuid::Uuid, attribute: Item, value: Item) -> axiom::ItemResult { let q = "INSERT INTO EAV (entity, attribute, value) VALUES (?, ?, ?)"; match value { Item::Dispenser(coll::Dispenser::Sized(coll::Sized::Associative(a))) => { let sub_id = uuid::Uuid::new_v4(); self.insert_item(a.fit(), sub_id)?; self.conn.execute(q, params![id, attribute, sub_id])?; } Item::Dispenser(coll::Dispenser::Sized(coll::Sized::Set(s))) => self.insert_iter( Some((id, attribute)), s.iter().map(|i| Item::derive(i.clone())), )?, 
Item::Dispenser(coll::Dispenser::Sized(coll::Sized::List(l))) => { self.insert_iter(Some((id, attribute)), (*l).clone())? } i => { self.conn.execute(q, params![id, attribute, i])?; } } Ok(Item::default()) } pub fn insert_iter<I>(&self, parent_link: Option<(uuid::Uuid, Item)>, l: I) -> Result<(), Error> where I: IntoIterator<Item = Item>, { for v in l { if is_value(&v) { match parent_link { Some(parent_link) => { self.insert_attribute(parent_link.0, parent_link.1.clone(), v.clone())?; return Ok(()); } None => { return Err(Error::expected("parent-link", v.clone())); } } } else { let sub_id = uuid::Uuid::new_v4(); self.insert_item(v.clone(), sub_id)?; if let Some(parent_link) = parent_link.clone() { self.insert_attribute( parent_link.0, parent_link.1.clone(), sub_id.into_bytes().to_vec().fit(), )?; } } } Ok(()) } pub fn insert_item(&self, i: Item, id: uuid::Uuid) -> axiom::ItemResult { let s = coll::Sized::try_derive(i)?; match s { coll::Sized::Associative(a) => { for (k, v) in a.to_iter() { //println!("Insert! 
{:?} {:?}", k, v); let w: types::Word = k.try_fit()?; self.insert_attribute(id, Item::Word(w), v)?; } } coll::Sized::List(l) => { self.insert_iter(None, (*l).clone())?; } s => return Err(Error::expected("db-object", s)), } Ok(Item::default()) } } pub fn is_value(i: &Item) -> bool { match i { Item::Dispenser(coll::Dispenser::Sized(coll::Sized::String(_))) => true, Item::Dispenser(coll::Dispenser::Sized(coll::Sized::Bytes(_))) => true, Item::Number(_) => true, Item::Word(_) => true, Item::Char(_) => true, _ => false, } } pub fn query(q: Item, params: Item) -> axiom::ItemResult { let query: String = q.try_fit()?; // Needs to be association, and instead of a slice of ToSql, // we need &[(&str, &dyn ToSql)] (slice of pairs of strings and ToSql) let params: assoc::Associative = params.try_fit()?; let mut boxed_params: Vec<(String, Item)> = Vec::new(); for (k, v) in params.to_iter() { match k { assoc::KeyItem::String(s) => boxed_params.push((s, v)), k => return Err(Error::expected("string", k)), } } let db = Db::new()?; db.query(&query, boxed_params) } pub fn insert_object(i: Item) -> axiom::ItemResult { let db = Db::new()?; let id = uuid::Uuid::new_v4(); db.insert_item(i, id) } impl TryDerive<ValueRef<'_>> for Item { type Error = Error; fn try_derive(value: ValueRef) -> Result<Self, Self::Error> { match value { ValueRef::Integer(i) => Ok(Item::Number(Number::Int(i))), ValueRef::Real(f) => Ok(Item::Number(Number::Float(f))), ValueRef::Text(t) => decode_string(String::from_utf8_lossy(t).into_owned()), ValueRef::Blob(b) => Ok(Item::Dispenser(coll::Dispenser::Sized(coll::Sized::Bytes( b.to_vec(), )))), ValueRef::Null => Ok(Item::default()), } } } /// Since sqlite doesn't have separate string/word/char types, we /// store them all as String, and encode a prefix to note which type /// it should be when decoded. 
fn decode_string(s: String) -> axiom::ItemResult { if let Some(w) = s.strip_prefix("w|") { Ok(Item::Word(w.fit())) } else if let Some(s) = s.strip_prefix("s|") { Ok(Item::Dispenser(coll::Dispenser::Sized( coll::Sized::String(s.to_string()), ))) } else if let Some(c) = s.strip_prefix("c|") { let char_seq = c; if char_seq.chars().count() == 1 { Ok(Item::Char(char_seq.chars().next().unwrap())) } else { Err(Error::expected("char", char_seq)) } } else { Err(Error::expected("string", s)) } } enum EncodeAs { String(String), Char(types::Char), Word(types::Word), } impl EncodeAs { fn encode(self: EncodeAs) -> ToSqlOutput<'static> { ToSqlOutput::Owned(Value::Text(match self { EncodeAs::String(s) => format!("s|{}", s), EncodeAs::Word(w) => format!("w|{}", String::derive(w)), EncodeAs::Char(c) => format!("c|{}", String::from(c)), })) } } impl rusqlite::ToSql for Item { fn to_sql(&self) -> Result<ToSqlOutput<'_>, DBError> { match self { Item::Number(Number::Int(i)) => i.to_sql(), Item::Number(Number::Float(f)) => f.to_sql(), Item::Char(c) => Ok(EncodeAs::Char(*c).encode()), Item::Word(w) => Ok(EncodeAs::Word(w.clone()).encode()), Item::Dispenser(coll::Dispenser::Sized(coll::Sized::String(s))) => { Ok(EncodeAs::String(s.clone()).encode()) } Item::Receptacle(coll::Receptacle::Sized(coll::Sized::String(s))) => { Ok(EncodeAs::String(s.clone()).encode()) } Item::Dispenser(coll::Dispenser::Sized(coll::Sized::Bytes(b))) => b.to_sql(), Item::Receptacle(coll::Receptacle::Sized(coll::Sized::Bytes(b))) => b.to_sql(), _ => todo!("convert item variants to sql values"), } } } impl From<rusqlite::Error> for Error { fn from(error: rusqlite::Error) -> Self { Error::create( list!("io"), error.to_string().as_str(), Option::<Item>::None, ) } }
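The "w|" / "s|" / "c|" prefix scheme above can be exercised standalone, without sqlite or the Item type. This sketch uses a hypothetical Decoded enum of mine in place of Item, and mirrors decode_string's rules, including rejecting multi-char "c|" payloads and unprefixed strings:

```rust
// Stand-in for Item: sqlite has one TEXT type, so words, strings and
// chars are disambiguated by a two-character prefix.
#[derive(Debug, PartialEq)]
enum Decoded {
    Word(String),
    Str(String),
    Char(char),
}

fn encode(d: &Decoded) -> String {
    match d {
        Decoded::Word(w) => format!("w|{}", w),
        Decoded::Str(s) => format!("s|{}", s),
        Decoded::Char(c) => format!("c|{}", c),
    }
}

fn decode(s: &str) -> Result<Decoded, String> {
    if let Some(w) = s.strip_prefix("w|") {
        Ok(Decoded::Word(w.to_string()))
    } else if let Some(s) = s.strip_prefix("s|") {
        Ok(Decoded::Str(s.to_string()))
    } else if let Some(c) = s.strip_prefix("c|") {
        let mut chars = c.chars();
        match (chars.next(), chars.next()) {
            // Exactly one char after the prefix, or it's an error.
            (Some(ch), None) => Ok(Decoded::Char(ch)),
            _ => Err(format!("expected a single char, got {:?}", c)),
        }
    } else {
        Err(format!("missing type prefix on {:?}", s))
    }
}

fn main() {
    let v = Decoded::Word("swap".to_string());
    assert_eq!(decode(&encode(&v)).unwrap(), v); // round-trips
    assert_eq!(decode("c|x").unwrap(), Decoded::Char('x'));
    assert!(decode("c|xy").is_err()); // too many chars
    assert!(decode("no-prefix").is_err()); // unknown prefix
    println!("ok");
}
```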
2. Issues
2.1. INPROGRESS Interactive mode tools
Run with kcats -i for interactive mode, where you get a REPL-like prompt. Each prompt accepts kcats items as input and updates the state accordingly. There are special commands to print the current state, clear it, write to a file, etc.
2.1.1. CANCELED Only print the changed part of the stack
- State "CANCELED" from "TODO"
I think it's too complicated for it to be clear exactly what changed. How can we tell if we replaced a stack item or added one?
2.1.2. TODO Emacs keybindings to send common stack ops
- swap / swapdown
- clear ([] evert drop)
- clone
- snapshot?
- drop
- sink / float
2.2. INPROGRESS Implement pipes stdlib
2.2.1. DONE Write to a file
[[file "/tmp/bar4"]] pipe-in ["hello world!" "Nice to meet you!" "My name is kcats"] ["\n" join bytes put] step
[[asked [pipe]] [unwound [["Nice to meet you!" "My name is kcats"] ["\n" join bytes put] step]] [type error] [reason "type mismatch"]] [[type pipe] [file "/tmp/bar4"]]
[[file "/tmp/bar101r7"]] pipe-in "hello world!" bytes put
[[type pipe] [file "/tmp/bar101r7"]]
[[file "/tmp/bar101r7"]] pipe-out take string
"hello world!" [[type pipe] [file "/tmp/bar101r7"]]
2.2.2. DONE Read from a file
"" [string join] [[file "/tmp/bar2"]] pipe-out collect
stack: [[[reason "type mismatch"] [asked [pipe]] [type error]] [[file "/tmp/bar2"] [type pipe]] ""] program: [swap [string join] dip [closed? not] shield [take swap [string join] dip [closed? not] shield] loop drop]
dictionary [collect spec] lookup
[[[type error] [reason "word is not defined"] [asked [fail]]] "Lookup attempted on non association value" [spec] [[definition [swap [take swap] swap put [dip] join [[closed? not]] dip while drop]] [spec [[pipe program] [item]]]]]
2.2.3. DONE Close a pipe
[[file "/tmp/foopytoop"]] pipe-in "foo" bytes put close "bar" bytes put
[[type pipe] [file "/tmp/foopytoop"]]
2.2.4. DONE Serialize pipes with something sane
Maybe they can't be easily round-tripped, but at least we can print something reasonable that will tell human eyes what it is. Something like [[type pipe-in] [file "/tmp/foo"]]
2.2.5. DONE Sockets
2.2.5.1. DONE Server Sockets
[[type ip-host] [address "127.0.0.1"] [port 11211]] pipe-out
socket: Int(11211) String("127.0.0.1") [[type pipe] [serversocket todo: fix serversocket local addr async issue]]
"127.0.0.1" 12345 serversocket
socket: Int(12345) String("127.0.0.1") [[type pipe] [serversocket todo: fix serversocket local addr async issue]]
[[type ip-host] [address "127.0.0.1"] [port 11211]] pipe-out ;; server socket take ;; accept connection by taking a socket out of the pipe "foo\n" bytes put ;; write a message to the socket take string ;; get a message from the socket [drop ;; close the socket drop] ;; close the server socket dip
[[asked [string]] [unwound [take "foo\n" bytes put take string [drop drop] dip]] [type error] [reason "type mismatch"]]
2.2.5.2. DONE Sockets
2.2.5.3. CANCELED Assemble is broken when reading files
- State "CANCELED" from "INPROGRESS"
I think it's because closed? is broken.
"" [string join] [[file "bar"]] pipe-out assemble
"" [string join] [[file "bar"]] pipe-out take drop take drop closed?
checking file closed false Got 3 bytes checking file closed false Got 0 bytes Closing! checking file closed false [] [string join] ""
I see the problem. When we clone the pipe, we also clone the closed boolean, and we shouldn't be doing that. There should only be one copy of it. The entire struct should be in an Arc<Mutex>, not just the file field, so that when we modify the boolean, every clone sees the change.
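A minimal Rust sketch of that fix (names hypothetical, not the actual kcats pipe implementation): with the whole state behind one Arc<Mutex>, a cloned pipe observes the same closed flag instead of copying it.

```rust
use std::sync::{Arc, Mutex};

// Hypothetical pipe state: the whole struct lives behind one Arc<Mutex>,
// so cloning the pipe shares the `closed` flag instead of copying it.
struct PipeState {
    closed: bool,
    // ... the file handle, buffers, etc. would live here too
}

#[derive(Clone)]
struct Pipe {
    state: Arc<Mutex<PipeState>>,
}

impl Pipe {
    fn new() -> Self {
        Pipe {
            state: Arc::new(Mutex::new(PipeState { closed: false })),
        }
    }
    fn close(&self) {
        self.state.lock().unwrap().closed = true;
    }
    fn is_closed(&self) -> bool {
        self.state.lock().unwrap().closed
    }
}

fn main() {
    let p = Pipe::new();
    let q = p.clone(); // shares the same state, not a copy of it
    p.close();
    assert!(q.is_closed()); // the clone sees the close
}
```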
2.2.6. DONE Convert In/Out traits to enums in pipes modules
Enums seem to work well elsewhere, and since pipes are also a closed set, we can use them here too.
I don't think there will ever be user-created pipe types as it would have to be done in rust and not in kcats.
2.2.7. DONE Composable transforms
There should be some way to compose transforms in a pipe. For example, we can have a pipe that when you put bytes in it, it gets written to a certain file on disk. But what we really want is that we put bytes into it, and they get compressed with lz4 before being written to disk.
I suppose pump could take an optional transducer-like thing, and those could be composable. The transformations I'm thinking of generally aren't going to be i/o, it's pure computation. Actually I guess any pipe could take an optional transform. Clojure.core.async channels do this.
Maybe the first thing to do is implement transducers?
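As a sketch of what composable transforms could look like (hypothetical names; plain byte-to-byte functions stand in for real transducers and for lz4):

```rust
// Hypothetical sketch: a pipe transform is a function from one chunk to
// another, and transforms compose like ordinary functions. A "compressing
// file pipe" is then just a file pipe with a composed transform attached.
type Transform = Box<dyn Fn(Vec<u8>) -> Vec<u8>>;

fn compose(f: Transform, g: Transform) -> Transform {
    Box::new(move |chunk| g(f(chunk)))
}

fn main() {
    // Stand-ins for real transforms (e.g. lz4 compression):
    let upper: Transform = Box::new(|c| c.to_ascii_uppercase());
    let frame: Transform = Box::new(|mut c| {
        let mut out = vec![c.len() as u8]; // toy length prefix
        out.append(&mut c);
        out
    });
    let xf = compose(upper, frame);
    // `put` on the pipe would run `xf` on each chunk before writing.
    assert_eq!(xf(b"hello".to_vec()), b"\x05HELLO".to_vec());
}
```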
2.2.7.1. DONE Siphon from one pipe to another
A nice primitive would be a word that takes a program (the program should expect an item on ToS and it should leave a transformed item) and two pipes, and takes from one pipe, runs the program, and puts the result back into the 2nd pipe. It should close the output pipe when the input pipe closes. Should work with generators as input.
This should all work ok except for when programs somewhere in the generator stack need access to items beneath the generator and we don't know how to get to them.
The obvious solution to that is to include the needed values in the program before giving it to the generator. Then the values will be in a known place on the stack.
This little program will siphon directly from a generator to a receptacle:
integers 5 taker [] ;; receptacle [] ;; placeholder that gets dropped (next iteration it will hold a ;; copy of the last element which is only needed to check if the ;; loop continues and can be dropped after) [empty?] ;; stop when generator returns nothing [drop ;; the last value [generate clone] dip sink [put] dip] until drop ;; drop the now-empty dispenser
[0 1 2 3 4 []] [[positive?] [dec [generate] dive] [[]] if] 0 [inc clone] 4
integers 5 taker [] siphon
[[type error] [actual [[positive?] [dec [generate] dive] [[]] if]] [asked [generator]] [unwound [siphon]] [reason "type mismatch"] [handled true]] [] [[positive?] [dec [generate] dive] [[]] if] 5 [inc clone] -1
And since pipes can have generator layers put on top of them, I think we're done.
2.2.8. CANCELED Filled pipes
Mostly for testing purposes, takes a list and creates a buffered pipe that offers list items until the list is exhausted and then returns pipe closed errors.
[1 2 3] filled take
1 [[type pipe] [filled todo: id-or-hash here]]
2.2.9. INPROGRESS Object pipes
2.2.9.1. INPROGRESS Generator re-splitting
- State "INPROGRESS" from "TODO"
These pipes should send serialized kcats objects and each put/take should transfer 1 object. Maybe use protocol buffers or similar
This could be done using a network pipe, and an assemble function that pulls byte chunks and builds objects when there are enough bytes for one object, and puts them into a handoff pipe.
This should be possible to do entirely in kcats, similar to how the interactive mode works. Send a length, then send that number of bytes. Then the receiving transform can track how many bytes it has left to receive and the partial encoded item it's got so far. It takes the next chunk, knocks off that many bytes (if it's more than needed for that item), and calls read. If it's still not enough for the full item, it appends to the partial encoded item and decreases the 'bytes needed' number.
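That length-then-bytes framing can be sketched in Rust (hypothetical helper names; the real transform would be written in kcats): encode writes "<len>\n<payload>", and decode keeps a buffer, yielding an object only once enough bytes have arrived.

```rust
// Sketch (not the kcats wire format itself) of length-prefixed framing.
fn encode_frame(payload: &[u8]) -> Vec<u8> {
    let mut out = format!("{}\n", payload.len()).into_bytes();
    out.extend_from_slice(payload);
    out
}

// Try to pull one complete frame out of `buf`, leaving any remainder.
fn decode_frame(buf: &mut Vec<u8>) -> Option<Vec<u8>> {
    let nl = buf.iter().position(|&b| b == b'\n')?;
    let len: usize = std::str::from_utf8(&buf[..nl]).ok()?.parse().ok()?;
    if buf.len() < nl + 1 + len {
        return None; // not enough bytes yet; wait for the next chunk
    }
    let payload = buf[nl + 1..nl + 1 + len].to_vec();
    buf.drain(..nl + 1 + len);
    Some(payload)
}

fn main() {
    let mut buf = Vec::new();
    // Chunks arrive in arbitrary sizes, possibly splitting a frame:
    buf.extend_from_slice(&encode_frame(b"[1 2 3]")[..4]);
    assert_eq!(decode_frame(&mut buf), None); // incomplete
    buf.extend_from_slice(&encode_frame(b"[1 2 3]")[4..]);
    assert_eq!(decode_frame(&mut buf), Some(b"[1 2 3]".to_vec()));
    assert!(buf.is_empty());
}
```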
This mechanism of using kcats serialization means we can't send associations and sets over the wire as-is. We'd have to send them as a list and convert them at the other end.
Let's see if we can make an object serializer that sends the length first (separated by \n).
[1 2 3] emit bytes [count] shield string "\n" join bytes swap join
#b64 "NwpbMSAyIDNd"
That's pretty easy! The trickier part is a deserializer where we don't know how many bytes we're going to get in a chunk.
First we might need a generator that divides into lines. A generic splitter generator would do most of the work.
"foo\nbar\nbaz\n\n" [take] "\n" ;; \f [empty] shield [[[generate] divedown [clone [put] dip] bail] [[[] [drop swap ends? not]] [execute] every?] [drop] prime drop [swap ends?] [[[count] shield] dive [[count] shield] dive swap - [0] dip slice] when [empty] shield swap] collect
["foo" "bar" "baz"] [[[generate] divedown [clone [put] dip] bail] [[[] [drop swap ends? not]] [execute] every?] [drop] prime drop [swap ends?] [[[count] shield] dive [[count] shield] dive swap - [0] dip slice] when [empty] shield swap] "" "\n" [take] ""
"foo\nbar\nbaz\n\n" [take] "\n" ;; \f split collect
["foo" "bar" "baz"] [[[generate] divedown [clone [put] dip] bail] [[[] [drop swap ends? not]] [execute] every?] [drop] prime drop [swap ends?] [[[count] shield] dive [[count] shield] dive swap - [0] dip slice] when [empty] shield swap] "" "\n" [take] ""
Ok, this works, but ultimately what we need is resplit, which takes a list of sized (all the same type, presumably) and joins and splits piece by piece.
We could just create something like atomize, that takes a generator of lists and emits single items.
["foo\n" "bar\nba" "z\n\n"] [take] [] [[] [take] [drop generate take] if]
"foo\nbar\nbaz\n\n" [[] [take] [drop generate take] if] [] [take] []
Ok, the atomize is still handy, but what I'm going to do is implement splitter, which takes a string and emits fields. Then I can use that generator within a re-split chunks generator that keeps partial content as state.
So here's that split gen:
;"foo\nbar\nbaz\n\n" [take] "\n" ;[1 2 3 2 5] [take] [2] ["foo\nbeep" "bar\nba" "z\n\n"] [take] "\n" [empty] shield ;"foo\nbeep" ;; while the state has no separator, pull chunks into it [yes [[[drop swap contains? not] ;; state doesn't have sep? []] ;; last item still something [execute] every?] [drop ;; the previous chunk [generate] divedown clone [[join] dip] bail] while drop ;; now call the split generator internally wrap [take] put [clone] dive put reverse [split execute [dropdown] 3 times] inject unwrap swap ] collect
["foo" "beepbar" "baz"] [yes [[[drop swap contains? not] []] [execute] every?] [drop [generate] divedown clone [[join] dip] bail] while drop wrap [take] put [clone] dive put reverse [split execute [dropdown] 3 times] inject unwrap swap] "" "\n" [take] []
This is great and all, but maybe not quite what we need for object serialization. Objects can have \n embedded within strings, so that character doesn't necessarily mean "end of object". It's just used to separate the byte count from the content.
I think we can implement this pretty directly and easily, especially if we don't have to account for the case where the count is split across chunks. We can have a generator with the following state: a count of chars to read, current content.
Generate, read the count, then loop until there is more content left in buffer than the count says to expect. Slice off [count] characters from the buffer and return it if there's enough in the buffer, otherwise generate and repeat.
;["5\n[1 2]13\n[ooba " "bazquu]11\nboobooboobo"] ["3\n" "fpp4\nfoo"] ;[bytes] map [take] ;[string] each "" 0 [[[complete? [swap count <=]] [readcount [drop [take] "\n" split generate [[drop] 4 times] dip [read first] bail 0 or]]] [[[[[generate] dive] [[[] [\newline contains? not]] [execute] every?] [join [generate] dive] prime join] dip [swap \newline contains?] [readcount] when] ;[dump [generate] dive [] [join] [drop] if readcount] [dump complete? not] [[generate] divedown swap [join] dip] prime] let dump cut 0 swap] [read first] each collect
2.2.9.2. TODO Another take on re-splitting
A more generic version of the re-splitter:
Take as input a buffer (probably of Bytes), and a function of that buffer that returns 0 or more objects and a new buffer that's the same or smaller (with parsed objects sliced out). It has additional state of a buffer of parsed objects (used when we find say, 10 words in a single byte chunk, if we're trying to parse words - we can't emit 10 objects at once so we need to save the other 9).
Then the resplitting generator becomes:
- take from parent generator. If empty, emit from the object buffer (if any). If non-empty, join with previous byte buffer state
- run the splitter function on the byte buffer, join with previous object buffer state, emit one object
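The two steps above can be sketched as a Rust iterator (standing in for a kcats generator; all names hypothetical): the splitter function examines the byte buffer and returns zero or more parsed objects plus the unconsumed remainder, and the generator keeps both a byte buffer and an object buffer as state.

```rust
// Sketch of the generic re-splitter described above.
struct Resplit<I, F> {
    source: I,             // parent generator of chunks
    split: F,              // buffer -> (objects, remainder)
    bytes: Vec<u8>,        // unconsumed byte buffer
    objects: Vec<Vec<u8>>, // parsed-but-not-yet-emitted objects
}

impl<I, F> Iterator for Resplit<I, F>
where
    I: Iterator<Item = Vec<u8>>,
    F: Fn(Vec<u8>) -> (Vec<Vec<u8>>, Vec<u8>),
{
    type Item = Vec<u8>;
    fn next(&mut self) -> Option<Vec<u8>> {
        loop {
            if !self.objects.is_empty() {
                return Some(self.objects.remove(0)); // emit one object
            }
            let chunk = self.source.next()?; // parent exhausted -> done
            self.bytes.extend(chunk);
            let (objs, rest) = (self.split)(std::mem::take(&mut self.bytes));
            self.objects = objs;
            self.bytes = rest;
        }
    }
}

fn main() {
    // Hypothetical splitter: slice off complete newline-terminated lines.
    let split = |buf: Vec<u8>| {
        let mut objs = Vec::new();
        let mut rest = buf;
        while let Some(i) = rest.iter().position(|&b| b == b'\n') {
            let mut line: Vec<u8> = rest.drain(..=i).collect();
            line.pop(); // drop the newline
            objs.push(line);
        }
        (objs, rest)
    };
    let chunks = vec![b"foo\nba".to_vec(), b"r\nbaz\n".to_vec()];
    let out: Vec<Vec<u8>> = Resplit {
        source: chunks.into_iter(),
        split,
        bytes: Vec::new(),
        objects: Vec::new(),
    }
    .collect();
    assert_eq!(out, vec![b"foo".to_vec(), b"bar".to_vec(), b"baz".to_vec()]);
}
```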
2.2.10. DONE Time pipe
Each take from the pipe returns the current unix time in ms. It should be a "singleton" - probably using Box::leak, so that we can insert a copy of this pipe whenever we want and it's always a reference to the same object. It might be an Arc for compatibility, even though we don't need to ref-count (but I suspect we don't need the Arc).
timestamps take
1687273991929 [[from systemtime] [values [[type integer] [units milliseconds]]] [type out]]
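The singleton idea could be sketched like this in Rust (hypothetical names, not the actual kcats pipe code): Box::leak turns a one-time heap allocation into a &'static reference, so every "copy" of the timestamps pipe is the very same object, with no Arc needed.

```rust
use std::time::{SystemTime, UNIX_EPOCH};

// Singleton sketch: Box::leak gives a &'static reference to one object.
struct TimePipe {
    units: &'static str,
}

impl TimePipe {
    fn take(&self) -> u128 {
        SystemTime::now()
            .duration_since(UNIX_EPOCH)
            .unwrap()
            .as_millis() // current unix time in ms, per take
    }
}

fn timestamps() -> &'static TimePipe {
    use std::sync::OnceLock;
    static PIPE: OnceLock<&'static TimePipe> = OnceLock::new();
    // Box::leak: allocate once, never free; no Arc/ref-counting needed.
    *PIPE.get_or_init(|| Box::leak(Box::new(TimePipe { units: "milliseconds" })))
}

fn main() {
    let a = timestamps();
    let b = timestamps();
    // Both handles point at the very same leaked object.
    assert!(std::ptr::eq(a, b));
    assert_eq!(a.units, "milliseconds");
    assert!(a.take() > 0);
}
```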
2.2.11. DONE stdin/stdout pipes
Should also be singleton. Should it always be a tunnel or should we allow separate access to in or out?
standard "foo" bytes put
foo[[type tunnel] [peer standard]]
Stdin is not tested, since currently the interpreter reads the program from stdin. May need to change that (read the program from filesystem and let the program itself access stdin).
2.2.12. CANCELED Pipe take outcome
State "CANCELED" from "TODO"
I don't think there's any glaring inconsistency here - indefinite (or I guess I might call them 'unsized') dispensers will dispense nothing when there's nothing left. That means that when you are using one of these, nothing is not a valid value you can use in the sequence. That means, for example, that if you wanted to print whether integers are odd or not, you can't quite do that. You'd need to use pairs (the original value and true/[] for whether it's odd).
Perhaps later we can think about signaling end-of-stream out of band. One way to do that is to use an unhandled error value that unwinds the stack, and you have to recover to catch it. But that introduces a lot of complexity, and I think it may be easier to just work around the fact that you can't use nothing in the data. It's possible that the complexity in the out-of-band impl could be abstracted away, so it's worth revisiting later.
There is some inconsistency with what happens when there's nothing left - empty lists just return nothing on take, but closed pipes return an error. May need to resolve this inconsistency.
|               | List    | Handoff | Socket  | StaticFile |
|---------------|---------|---------|---------|------------|
| take Items    | Item    | Item    | Bytes   | Bytes      |
| take Past EOF | Nothing | Nothing | Nothing | Nothing    |
| step Past EOF | Exit    | Exit    | Exit    | Exit       |
2.3. TODO Error should have actual struct fields optimization
It's still implemented as a generic HashMap data field.
2.4. INPROGRESS Script
- State "INPROGRESS" from "TODO"
2.4.1. DONE Cryptographic primitives
2.4.1.1. DONE SHA256
"foo" bytes hash "fop" bytes hash =
[]
["foo" bytes key] 2 times =
true
"foo" bytes key
[[public #b64 "NNJledu0Vmk+VAZyz5IvUt3g1lMuNb8GvgE6fFMvIOA="] [type elliptic-curve-key] [secret #b64 "LCa0a2j/xo/5m0U8HTBBNBNCLXBkg7+g+YpeiGJm564="]]
2.4.1.2. DONE Signing
"foo" bytes key "we attack at dawn" bytes [sign] shield verify
true
"foo" bytes key "we attack at dawn" bytes [sign] shield ;; now change the message [drop "we attack at dawn" bytes] dip verify
#b64 "d2UgYXR0YWNrIGF0IGRhd24="
We need to be able to construct scripts and their hash. What is the public key format? We can sort the assoc so that the serialization is always the same.
"foo" bytes key ;; new key [secret] unassign ;; discard the secret portion [first] sort ;; make sure the assoc is always serialized the same way wrap [sink verify] join emit; bytes hash
"[[[public #b64 \"NNJledu0Vmk+VAZyz5IvUt3g1lMuNb8GvgE6fFMvIOA=\"] [type elliptic-curve-key]] sink verify]"
So this is the script data. Then the high-level script (that's always the same) is: we've got inputs, a script, and a script hash. If the hash of the script is equal to the given hash, execute the program on the given input.
"[[[public #b64 \"NNJledu0Vmk+VAZyz5IvUt3g1lMuNb8GvgE6fFMvIOA=\"] [type elliptic-curve-key]] sink verify]" bytes [hash] shield
#b64 "SsjPm5GDruW/Ixa/pY97y+Y2JI1+siSETU6yJwlSUvM=" #b64 "W1tbcHVibGljICNiNjQgIk5OSmxlZHUwVm1rK1ZBWnl6NUl2VXQzZzFsTXVOYjhHdmdFNmZGTXZJT0E9Il0gW3R5cGUgZWxsaXB0aWMtY3VydmUta2V5XV0gc2luayB2ZXJpZnld"
Now let's make a signature with that same key
"foo" bytes key "we attack at dawn" bytes [sign] shield
#b64 "sAVOx61lJzZAcVMPNFBeDGjzaSej++hqjLctgr1stVcAMk+L1mSZC7nxbtj5+8rYj99zXKLZX6gQzO8bBvvlAA==" #b64 "d2UgYXR0YWNrIGF0IGRhd24=" [[type elliptic-curve-key] [secret #b64 "LCa0a2j/xo/5m0U8HTBBNBNCLXBkg7+g+YpeiGJm564="] [public #b64 "NNJledu0Vmk+VAZyz5IvUt3g1lMuNb8GvgE6fFMvIOA="]]
Now use that data and execute the script on it
[#b64 "sAVOx61lJzZAcVMPNFBeDGjzaSej++hqjLctgr1stVcAMk+L1mSZC7nxbtj5+8rYj99zXKLZX6gQzO8bBvvlAA==" #b64 "d2UgYXR0YWNrIGF0IGRhd24="] emit read first "[[[public #b64 \"NNJledu0Vmk+VAZyz5IvUt3g1lMuNb8GvgE6fFMvIOA=\"] [type elliptic-curve-key]] sink verify]" read first inject first string
"we attack at dawn"
Now let's make a word 'authenticate', that takes a script hash, a script, and its args, and returns true if it's the right script and it validates. Important: check the hash before attempting to execute or even read the script. That ensures that it's what the sender intended (doesn't protect against malicious real sender, just malicious impostors).
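The important ordering - compare the hash before reading or executing the script - can be sketched like this (a toy hash and runner stand in for SHA-256 and read/inject; not real crypto, all names hypothetical):

```rust
// Sketch of the 'authenticate' contract: hash the raw script bytes and
// compare against the expected hash BEFORE parsing or executing anything.
fn authenticate<H, R>(
    expected_hash: &[u8],
    script: &[u8],
    args: &[u8],
    hash: H,
    run: R,
) -> Option<Vec<u8>>
where
    H: Fn(&[u8]) -> Vec<u8>,
    R: Fn(&[u8], &[u8]) -> Option<Vec<u8>>,
{
    if hash(script) != expected_hash {
        return None; // wrong script: never even read it
    }
    run(script, args) // returns the message on success, None otherwise
}

fn main() {
    // Toy stand-ins, NOT real crypto:
    let hash = |b: &[u8]| vec![b.iter().fold(0u8, |a, &x| a.wrapping_add(x))];
    let run = |script: &[u8], args: &[u8]| {
        if script == &b"[sink verify]"[..] {
            Some(args.to_vec())
        } else {
            None
        }
    };
    let h = hash(b"[sink verify]");
    assert_eq!(
        authenticate(&h, b"[sink verify]", b"we attack at dawn", hash, run),
        Some(b"we attack at dawn".to_vec())
    );
    // A different script fails the hash check before running:
    assert_eq!(authenticate(&h, b"[true]", b"x", hash, run), None);
}
```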
#b64 "SsjPm5GDruW/Ixa/pY97y+Y2JI1+siSETU6yJwlSUvM=" ;; script hash #b64 "W1tbcHVibGljICNiNjQgIk5OSmxlZHUwVm1rK1ZBWnl6NUl2VXQzZzFsTXVOYjhHdmdFNmZGTXZJT0E9Il0gW3R5cGUgZWxsaXB0aWMtY3VydmUta2V5XV0gc2luayB2ZXJpZnld" ;; script ;"foo" [#b64 "sAVOx61lJzZAcVMPNFBeDGjzaSej++hqjLctgr1stVcAMk+L1mSZC7nxbtj5+8rYj99zXKLZX6gQzO8bBvvlAA==" #b64 "d2UgYXR1YWNrIGF0IGRhd24=" ] emit bytes ;; the proof (key) as serialized bytes list - the sig and message ;; first check hash [[[hash =] dive] [swap [string read first] both functional [inject] lingo first] [drop drop []] if] [[]] recover [string] bail ;; gives the message and who it's from
[] #b64 "SsjPm5GDruW/Ixa/pY97y+Y2JI1+siSETU6yJwlSUvM="
Try where the actual script is not what the hash requires, should return nothing
#b64 "SsjPm5GDruW/Ixa/pY97y+Y2JI1+siSETU6yJwlSUvM=" ;; script hash "[true]" emit bytes [#b64 "sAVOx61lJzZAcVMPNFBeDGjzaSej++hqjLctgr1stVcAMk+L1mSZC7nxbtj5+8rYj99zXKLZX6gQzO8bBvvlAA==" #b64 "d2UgYXR0YWNrIGF0IGRhd24=" ] ;; data as list ;; first check hash [[[hash =] dive] [swap string read first functional [inject] lingo first] [drop drop []] if] [[]] recover
[] #b64 "SsjPm5GDruW/Ixa/pY97y+Y2JI1+siSETU6yJwlSUvM="
Try where the signature is invalid by substituting a sig from a different message - same key.
"foo" bytes key "we attack at dusk" bytes sign
#b64 "XtOnDCT9+iiHV0BElSAckjo76e2yY3swEOOWo0FfstHgukymw9XXHm7+jLtEBsBjJzo5kyo6058WJ/XPpAe1Aw=="
#b64 "SsjPm5GDruW/Ixa/pY97y+Y2JI1+siSETU6yJwlSUvM=" ;; script hash #b64 "W1tbcHVibGljICNiNjQgIk5OSmxlZHUwVm1rK1ZBWnl6NUl2VXQzZzFsTXVOYjhHdmdFNmZGTXZJT0E9Il0gW3R5cGUgZWxsaXB0aWMtY3VydmUta2V5XV0gc2luayB2ZXJpZnld" ;; lock [#b64 "XtOnDCT9+iiHV0BElSAckjo76e2yY3swEOOWo0FfstHgukymw9XXHm7+jLtEBsBjJzo5kyo6058WJ/XPpAe1Aw==" #b64 "d2UgYXR0YWNrIGF0IGRhd24=" ] ;; data as list ;; first check hash [[[hash =] dive] [swap string read first functional [inject] lingo first] [drop drop []] if] [[]] recover
[] #b64 "SsjPm5GDruW/Ixa/pY97y+Y2JI1+siSETU6yJwlSUvM="
try a dummy script that really does always validate
[true] encode hash "[true]" encode [] [[dump [hash =] dive] [swap string read first functional [inject] lingo first] [drop drop []] if] [[]] recover
[[] #b64 "W3RydWVd" #b64 "M+LwVX3X2/aNUvQNUQxkH9+m5dpgq8cN+sB9K2tsvM8="] [] #b64 "M+LwVX3X2/aNUvQNUQxkH9+m5dpgq8cN+sB9K2tsvM8="
2.4.1.3. DONE Make verify return the message
- State "DONE" from "INPROGRESS"
One thing I hadn't considered before: we receive this package of "proof" - proof of what? That this message is from the party represented by the given script hash. What message? It's contained in the proof. The important thing is that if the proof is good, we return the message. I think a good contract is that we return the message (as bytes) if it's valid proof, otherwise nothing. If we only return true on valid proof, then we have to embark on digging out the message from potentially nested proofs. If we just return the message from each layer (on success), then we don't need that separate logic.
I think it's best to just have the contract of the word verify do this for us - there's no reason to return the truthy value true when the message is a perfectly good truthy value. I suppose signing an empty byte array could cause confusion (if that were considered "nothing", which I suppose it should be, but currently isn't). But I can't think of any valid reason to sign 'nothing'.
2.4.1.4. TODO AES Encryption
2.4.1.5. TODO Random
2.4.2. DONE Pure functional env
[[pipe-in pipe-out channel timeout handoff file-in file-out timestamps standard serversocket animate future spit tunnel ] [wrap unassign] step] [1 2 swap] lingo
1 2
2.4.3. TODO Infinite loop protection
We need to prevent an attacker presenting true [clone] loop as their identity proof, which would never halt. It may be easiest to just remove all the looping words from the dictionary, but that seems overly restrictive, when the point is just to limit the resources an attacker can consume, and we already have a direct solution for that:
[[program [true [clone] loop]]] environment
2.5. TODO retry should have opposite argument order stdlib consistency
Currently it expects an error on ToS and then a program beneath. But it seems like we'd nearly always have to dip the program beneath the error. I think it would be better if retry expected the program to fix the issue on top, and the error beneath.
2.6. INPROGRESS Support Kademlia DHT
2.6.1. DONE XOR
We have a node id (maybe just the i2p destination address?) and we want to calculate the distance to another node as the XOR of the two ids.
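The XOR metric itself is straightforward; a sketch over byte-slice node ids (smaller result, compared as a big-endian integer, means "closer"):

```rust
// Kademlia-style distance: XOR the two node ids byte by byte.
fn xor_distance(a: &[u8], b: &[u8]) -> Vec<u8> {
    assert_eq!(a.len(), b.len(), "node ids must be the same length");
    a.iter().zip(b).map(|(x, y)| x ^ y).collect()
}

fn main() {
    let me = [0b1010_0000u8, 0x01];
    let peer = [0b1000_0000u8, 0x01];
    assert_eq!(xor_distance(&me, &peer), vec![0b0010_0000, 0x00]);
    // Distance to self is zero, and the metric is symmetric:
    assert_eq!(xor_distance(&me, &me), vec![0, 0]);
    assert_eq!(xor_distance(&me, &peer), xor_distance(&peer, &me));
}
```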
2.6.2. INPROGRESS Simple API server
Construct a socket listener, and serve something from a trivial local database. Disable exploitable words. Catch errors and return to the user.
;; create an API service
;;
;; Takes from the stack:
;;
;; * a Database (can be a regular data structure for read-only apis),
;;   or a pipe to an actual (sql or other) database that accepts queries for
;;   read/write ops
;;
;; * a program that modifies the dictionary that clients can
;;   access. It should add words to make interaction easier (for
;;   example, you might provide a word 'customers' that gets the customers
;;   db table). It should also remove words that the clients should not be able
;;   to use - for example, they shouldn't be able to create file or network pipes.
;;
;; * a server socket pipe to serve from
;;
;; The client sends a program to run in a fresh environment where he
;; can expect to find:
;;
;; * The database (either a pipe or data structure)
;;
;; His program runs and then the resulting stack is returned to him.
;;
;; socket listener
[[type ip-host] [port 12121] [address "127.0.0.1"]] pipe-out
;; book db
"examples/books.kcats" file-out slurp read
;functional ;; dictionary modifications, removes any io access
;; API Server code begins here
;dictionary swap execute ;; -> new-dict db sock
;; start building the environment
[[program [take ;; the request as bytes
           swap ;; we want the pipe on top so we can dip the user's program under it -> pipe req db
           [string ;; translate to a string -> req-str db
            read first ;; the request program into a data structure -> prog db
            clone emit print ;; log the request
            functional [[execute] [] recover] lingo ;; the program -> items*
            snapshot] dip ;; under the pipe so the user's code has no access
           swap ;; -> response pipe
           emit ;; -> response-str pipe
           bytes ;; -> response-bytes pipe
           put ;; the response into the pipe
           drop ;; close the connection
           ]]] environment ;; -> env new-dict db sock
;[dictionary] float ;; -> new-dict [dictionary] env db sock
;assign ;; -> env db sock
;; now just need to assign the stack, which is [pipe db]
float ;; -> sock env db
;; loop to accept connections and start new env with the db and a pipe
;; to take requests and reply
[[float ;; -> pipe db env
  pair ;; -> stack env
  [stack] swap ;; -> stack ks env
  assign ;; -> env
  environment animate ;; let it fly
  ] shielddown ;; shielded so as not to consume the db each time
 drop ;; drop whatever the result is of this iteration, we don't need it
 ] step ;; accepts incoming connections until killed
This works ok for a read-only database, but for the purposes of a DHT we can't do it this way - we'd have to expose the database and there's no way to prevent the api user from making arbitrary (and malicious) changes.
2.6.3. INPROGRESS Simple API client
"localhost" 12121 socket [[[title] lookup count 10 <] filter] encode put [[take] joiner generate string read] shielddown
[[[[[author-first "George"] [author-last "Orwell"] [title "1984"] [year 1949] [subjects [government dystopia surveillance totalitarianism freedom]]] [[author-first "Charlotte"] [author-last "Bronte"] [title "Jane Eyre"] [year 1847] [subjects [love morality society class womanhood independence]]]]]]
2.6.4. TODO Kademlia functions
2.7. DONE read and emit don't have quite the same semantics consistency
- State "DONE" from "TODO"
read will read all the bytes and return however many objects were read. emit will take an object and return its serialization.
There should be some way of round-tripping here, maybe a word read1 or something that just reads one object.
2.8. DONE Inconsistent stack handling when encountering error consistency
- State "DONE" from "INPROGRESS"
I think this is complete; not aware of any more cases. Will reopen if needed.
- State "INPROGRESS" from "TODO"
Some words pop the arguments off the stack and then, if an error is encountered, throw the error without the args on the stack. Others leave the args intact. This needs to be consistent.
I would lean towards leaving the args intact so that retry is easily applied.
2.8.1. DONE 'read' on invalid edn consumes the string argument
- State "DONE" from "TODO"
It should attempt to parse before popping the item off the stack.
2.8.2. DONE Division by zero consumes stack items
- State "DONE" from "TODO"
5 0 /
shouldn't consume the 5 and 0 - compare to the behavior of 1 "2" + (which leaves items on the stack).
2.9. DONE Inconsistent expression handling when encountering error
- State "DONE" from "INPROGRESS"
- State "INPROGRESS" from "TODO"
Some errors lose the word on which they occurred. They should be in the expression still.
[[]] [foo] unwrap get
[[type error] [reason "type mismatch"] [asked [pair]] [actual []] [unwound [get]] [handled true]]
The word get should still be in the unwound field.
I think this only works correctly when the invalid argument is caught by spec checking and not in the actual axiom function.
1 "" +
[[reason "type mismatch"] [unwound [+]] [actual ""] [type error] [asked [number]] [handled true]] "" 1
Here's an example where the spec is too permissive and the actual function throws the error.
1 set
[[reason "type mismatch"] [type error] [asked [sized]] [unwound [set]] [actual 1] [handled true]]
The question then is how to fix this. Hopefully this can be fixed inside eval_step. After the function completes, we can check if there is an error on top (if there wasn't before), and if so, we can replace the error's unwound field so it still includes the word that failed.
2.10. TODO Performance optimizations optimization
2.10.1. TODO Compile programs
Here is how it could maybe be done. We already have a type StepFn (which takes an env and returns a new one, in a future).
So let's say we have a program [1 2 +], and we want to convert that into a StepFn. We could have a function compose and another self_insert, and then call compose([self_insert(1), self_insert(2), plus]), which would return a StepFn.
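That composition can be sketched as follows (with Env reduced to a bare integer stack and the names compose/self_insert/plus taken from the idea above; the real StepFn is async and takes a full environment):

```rust
// Sketch of compiling [1 2 +] into one composed StepFn.
type Env = Vec<i64>;
type StepFn = Box<dyn Fn(Env) -> Env>;

fn self_insert(n: i64) -> StepFn {
    Box::new(move |mut env| {
        env.push(n); // a literal just pushes itself
        env
    })
}

fn plus() -> StepFn {
    Box::new(|mut env| {
        let (a, b) = (env.pop().unwrap(), env.pop().unwrap());
        env.push(a + b);
        env
    })
}

fn compose(steps: Vec<StepFn>) -> StepFn {
    // Thread the env through each step in order.
    Box::new(move |env| steps.iter().fold(env, |e, step| step(e)))
}

fn main() {
    // [1 2 +] compiled to a single StepFn:
    let program = compose(vec![self_insert(1), self_insert(2), plus()]);
    assert_eq!(program(vec![]), vec![3]);
}
```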
Let's look at something more complex:
1 2 3 4 [+ *] dip
In this case, the program is the composition of the 5 self-inserts and dip. But what is self-inserted as the 5th item could be compiled, because we know dip follows it. Knowing in advance that a list can be compiled is the difficult part.
Let's try this:
0 1 [2 3 4] [[+] dip] step
In this case, the program for step is easy to spot, and in turn the one for dip.
How about this:
[+ *] [2 3 4] swap join execute
We can't know the first two programs can be compiled until later on, unless we look ahead in the program. Even then we can only know what arguments end up being passed to join and execute by examining the words' specs, and even that is not foolproof, as we have wildcard specs like dip where the stack change is arbitrary.
One major issue with this optimization is that it will stop the debugger from working properly, unless special care is taken: with the debugger we can go step by step, but if the function composition is bundled up, we can only "step over" that function and not "into" it. I am not sure if it's possible to build this such that we preserve stepping ability and increase performance substantially.
I've tried various approaches to this problem and honestly I can't find anything remotely simple that I'm confident would be a significant performance improvement. I'm not even sure something complex would be fast, short of a full blown compiler. And then whatever compiler that is would have to be available at runtime on all platforms - so I can't just compile to C source because I would then have to ship a C compiler too.
2.10.2. TODO Programs as their own immutable type
Programs executing in a loop are generally not modified (exception: the recur word, which can modify but usually just calls execute) - so when we execute a program with loop, we don't want to have to clone it each time through the loop.
Instead we'll do the following: when loop places a program into the program, instead of joining it, it's just going to put it right on top as a program - we may need to differentiate programs that are active vs meant to be run later. When eval-step runs, it sees an active program on the top of the program, so it calls next and gets a reference to the next word (or None if it's at the end, in which case it drops the program). Then we look up that word. If it's an axiom, we call it. If it's derived, we place a new program on the top of the program, with its PC set to 0. The actual programs are immutable, and behind an Rc. Each "copy" of the program is just an Rc and a counter. Then all programs are references except the counter.
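A sketch of the Rc-plus-counter representation (names hypothetical, words reduced to strings): each "copy" of a program is a cursor over one shared, immutable allocation.

```rust
use std::rc::Rc;

// The program itself is immutable behind an Rc; each "copy" is just the
// Rc plus a program counter, so loop iterations never clone the words.
struct ProgramRef {
    words: Rc<Vec<String>>,
    pc: usize,
}

impl ProgramRef {
    fn new(words: Rc<Vec<String>>) -> Self {
        ProgramRef { words, pc: 0 }
    }
    // Like the `next` described above: a reference to the next word,
    // or None when the program is finished (then drop this ProgramRef).
    fn next(&mut self) -> Option<&str> {
        let w = self.words.get(self.pc)?;
        self.pc += 1;
        Some(w.as_str())
    }
}

fn main() {
    let body: Rc<Vec<String>> =
        Rc::new(["inc", "clone"].iter().map(|s| s.to_string()).collect());
    // Two loop iterations = two cursors over the same allocation:
    let mut first = ProgramRef::new(Rc::clone(&body));
    let mut second = ProgramRef::new(Rc::clone(&body));
    assert_eq!(first.next(), Some("inc"));
    assert_eq!(first.next(), Some("clone"));
    assert_eq!(first.next(), None);
    assert_eq!(second.next(), Some("inc")); // independent counter
    assert_eq!(Rc::strong_count(&body), 3); // but one shared program
}
```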
example program:
[flip drop] ;;0 [float swapdown] [flip drop] ;; 0 1
[[+] shield] ;; pc 0 [[+] shield] ;; pc 1 [[snapshot] dip inject first] [[+] shield] ;; pc 0 1 ;; etc
So when printing out the program, we could cheat and only show the remaining program (instead of a stack of partially executed programs).
2.11. INPROGRESS Generators stdlib
2.11.1. DONE Basic functionality and generators
There's the concept of "lazy sequence" that I think maps nicely to pipes - you can keep calling 'take' and it keeps calculating new values. Everything it needs is contained in the object, it's not like a network or filesystem pipe where the data is coming from somewhere external. But it acts like a pipe.
0 [] ;; the producer - infinite seq of integers [[inc clone] dip swap put] ;; -> [1] 1 ;; the filter condition [3 mod 0 =] ;; divisible by 3 ;; filter-xf [pop] swap put [[put] [drop] if] join join ;; [generation filtration] [] 0 clone [execute] dip ;;generate ;; [3] clone [execute] dip ;;generate ;; [3] clone [execute] dip ;;generate ;; [3] clone [execute] dip ;;generate ;; [3]
The problem above is that generate will not produce a value until one passes the filter. I think filter needs to keep calling generate on the xf below it?
[[inc clone] dip swap put pop [3 mod 0 =] [put] [drop] if] [3] 4
1 [[unwound [[[[inc clone] dip swap put [pop [3 mod 0 =]] [put] [drop] if]] unwrap]] [type error] [asked [packable]] [actual 1] [reason "type mismatch"] [handled true]]
;; the impl of filter-xf [3 mod 0 =] [pop] swap put [[put] [drop] if] join
[pop [3 mod 0 =] [put] [drop] if]
0 [inc clone] clone [execute] dip swap drop clone [execute] dip swap
2 [inc clone] 2
0 [inc] [] [[generate] dip] ]
[[generate] dip] [] [inc] 1
[ ;;[1 2 3 4 6 9] liberate ;; produce from list 1 [2 * clone] ;; infinite list ;; increment each ;;[3 * 3 -] each ;; drop the first few 5 dropper ;; limit the list 10 taker ;; collect into list collect ] shield
[64 128 256 512 1024 2048 4096 8192 16384 32768]
0 [inc clone] generate
1 [inc clone] 1
Now express the debugger interface in terms of generated environment states!
;; the steps of execution [[program [0 0 10 1 range [+] step]]] environment [[[program] lookup something?] [eval-step clone] [[]] if] ;; the generator, which needs to emit 'nothing' once the program is empty [[stack] lookup] each 50 taker laster generate
[[reason word is not defined] [unwound [laster generate]] [type error] [asked [laster]] [handled true]] [[positive?] [dec [generate] dive] [[]] if] 50 [generate [[[stack] lookup] bail] shielddown] [[[program] lookup something?] [eval-step clone] [[]] if] [[stack []] [program [0 0 10 1 range [+] step]]]
implement 'laster' which returns only the last in the seq
0 100 1 range liberate laster generate
[1 2 3] traversal ;; a generator for the list [inc] each collect
99 [generate [] swap [] [swap drop [generate] dip swap] while drop] liberate []
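A Python sketch of the same idea (hypothetical name, not the kcats implementation): pull until the source is exhausted and return the final value:

```python
def laster(source):
    # exhaust the generator, remembering only the last item
    last = None
    for item in source:
        last = item
    return last

print(laster(iter(range(0, 100))))  # 99
```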
Now implement 'keep' which returns only an item that passes the filter
0 [inc clone] [odd?] keep 1 dropper 10 taker [clone *] each collect
[9 25 49 81 121 169 225 289 361 441] [generate [[clone *] bail] shielddown] [[positive?] [dec [generate] dip swap] [drop []] if] [[[positive?] [[generate drop] dip dec] while [generate swap] dip swapdown swap] bail] 0 [clone [[generate] dip [drop generate] while] dip swap] [[[something?] [odd? not]] [execute] every?] [inc clone] 21
[odd?] [something?] swap pair wrap [every?] join ;; [odd? not]
[[[something?] [odd?]] every?]
dropper (almost got it, doesn't detect end of parent stream yet)
[0 20 1 range liberate 5 dropper 10 taker [5 *] each [odd?] keep collect] shield
[25 35 45 55 65]
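The same pipeline sketched in Python, with each stage wrapping the generator below it (the stage names mirror the kcats words, but this is only an analogy):

```python
def dropper(source, n):
    # skip the first n items, then pass the rest through
    for i, item in enumerate(source):
        if i >= n:
            yield item

def taker(source, n):
    # yield at most n items
    for i, item in enumerate(source):
        if i >= n:
            return
        yield item

stages = taker(dropper(iter(range(0, 20)), 5), 10)
result = [y for y in (x * 5 for x in stages) if y % 2 == 1]
print(result)  # [25, 35, 45, 55, 65]
```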
Collect fix
[1 2 3] liberate generate ;; n [] swap clone ;; n n r [put ;; r [generate] dip ;; r n swap clone] ;; n n r loop drop
[1 2 3] liberate []
integers 10 taker collect drop generate
[] [[positive?] [dec [generate] dive] [[]] if] 0 [inc clone] 9
2.11.2. DONE map
2.11.3. DONE filter
2.11.4. DONE take
2.11.5. DONE drop
integers 15 taker 10 dropper [+] reduce
60 [[[positive?] [[generate drop] dip dec] while [generate swap] dip float] bail] 0 [[positive?] [dec [generate] dive] [[]] if] 0 [inc clone] 14
2.11.6. DONE drop-while (skipper)
- State "DONE" from "INPROGRESS"
- State "INPROGRESS" from "TODO"
This is what drop-while looks like:
[] [take] [positive?] [] ;; the state (whether threshold reached) [[] ;; condition - whether we've finished dropping or not [[generate] divedown] ;; true - pass everything else through [[[generate] divedown] ;; prime init [[[clone] divedown execute] bail] ;; bring pred up and exec it [drop] ;; if pred passes drop the value prime ;; after this should have value on top [drop true] dip ;; set flag ] ;; false - generate, check pred, repeat if] collect
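A Python sketch of the same stateful idea: a flag that flips once the predicate first fails, after which everything passes through:

```python
def skipper(source, pred):
    # drop items while pred holds; once it fails, pass everything through
    dropping = True
    for item in source:
        if dropping and pred(item):
            continue
        dropping = False
        yield item

print(list(skipper(iter([3, 1, 4, 1, 5]), lambda x: x < 4)))  # [4, 1, 5]
```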
2.11.7. DONE take-while (catcher)
- State "DONE" from "INPROGRESS"
- State "INPROGRESS" from "TODO"
[1 2 3 ] [take] [positive?] [[generate] dive [[[clone] dive execute] bail not] [drop []] when] collect
[1 2 3] [[generate] dive [[[clone] dive execute] bail not] [drop []] when] [positive?] [take] []
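And the Python analogue of take-while: the first item that fails the predicate ends the whole stream:

```python
def catcher(source, pred):
    # yield items while pred holds; the first failure ends the stream
    for item in source:
        if not pred(item):
            return
        yield item

print(list(catcher(iter([1, 2, 3, -1, 4]), lambda x: x > 0)))  # [1, 2, 3]
```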
2.11.8. CANCELED last
2.11.9. TODO distinct
Depends on sets. The difference between this and just calling set is that the result is still a list, and it preserves the original order, just removes duplicates. Should be a similar impl to keep.
[1 1 3] liberate [] set ;; state [[generate] dive ;; n seen g [contains?] [put ;; seen g [generate] dive] ;; n seen g while ] collect
[1 1 3] [[generate] dive [contains?] [put [generate] dive] while] [] [take] []
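A minimal Python sketch of the intended behavior (order-preserving dedupe backed by a set):

```python
def distinct(source):
    # preserve original order, drop items already seen
    seen = set()
    for item in source:
        if item not in seen:
            seen.add(item)
            yield item

print(list(distinct(iter([1, 1, 3]))))  # [1, 3]
```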
2.11.10. DONE partition
- State "DONE" from "TODO"
[1 2 3 4 5 6] [take] 2 [] [[generate] dive] [[[] [drop count inc ]] [execute] every?] []
[] [[[] [drop count inc]] [execute] every?] [[generate] dive] [] 2 [take] [1 2 3 4 5 6]
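In Python terms, the behavior being implemented is (note the trailing incomplete chunk is still emitted):

```python
def partition(source, n):
    # group items into chunks of n; a final short chunk is kept
    chunk = []
    for item in source:
        chunk.append(item)
        if len(chunk) == n:
            yield chunk
            chunk = []
    if chunk:
        yield chunk

print(list(partition(iter([1, 2, 3, 4, 5, 6]), 2)))  # [[1, 2], [3, 4], [5, 6]]
```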
2.11.11. DONE joiner (aka catenate)
[[1 2 3] [4 5 6] [7 8 9]] liberate [generate [] swap [] [join [generate] dip swap] while drop] generate
[1 2 3 4 5 6 7 8 9] [generate [] swap [] [join [generate] dip swap] while drop] [take] []
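The Python analogue is a one-level flattening generator:

```python
def joiner(source):
    # flatten one level: yield each element of each sub-sequence
    for seq in source:
        yield from seq

print(list(joiner(iter([[1, 2, 3], [4, 5, 6], [7, 8, 9]]))))
# [1, 2, 3, 4, 5, 6, 7, 8, 9]
```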
2.11.12. DONE groupby
- State "DONE" from "TODO"
Implemented as `group`
["foo" "bar" "baaz" "quux"] liberate ;; (the next word foo) ;liberate ;; (the first letter f) [take] wrap [shield ;; k v state wrap swap ;; v k state wrap [put] join update] join [] association ;; state f swap cram
[[\q ["quux"]] [\f ["foo"]] [\b ["bar" "baaz"]]] [take] []
Ok so now we just need to insert the [take] program instead of specifying it inline.
[1 2 3 4] [take] [odd?] group
[[true [1 3]] [[] [2 4]]] [take] []
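In Python terms, group builds an association from each key to the list of items that produced it, in encounter order:

```python
def group(source, keyfn):
    # map each key to the list of items producing it
    groups = {}
    for item in source:
        groups.setdefault(keyfn(item), []).append(item)
    return groups

print(group(iter([1, 2, 3, 4]), lambda n: n % 2 == 1))
# {True: [1, 3], False: [2, 4]}
```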
2.11.13. CANCELED Map/filter can't access lower stack items
2.11.13.1. Problem
this doesn't work:
10 [1 2 3] liberate [+] each
[generate [[+] bail] shielddown] [take] [1 2 3] 10
We should get [11 12 13] but it errors out.
The reason is that when + runs, the generators are still on the stack, in between this mapping function, and the original stack arguments.
We need a way to break out of the generation part of the stack and let the mapping function access the arguments below it.
I can't immediately think of a good way to do it.
Actually I think that instead of recursively calling generate, and passing the values back up the stack, there might be a way to build up the program recursively, and then execute it in one swoop?
Perhaps we can split each stage into several parts:
- Generate from the layer below (in which case we obviously need the layers below to get the next value)
- dip underneath the layers to calculate the next value using lower stack items
- swap the new value to the top of stack
2.11.13.2. Debug session
[[program [10 [1 2 3] liberate [+] each generate]]] environment advance advance advance advance eval-step [advance] 5 times eval-step [advance] 2 times [eval-step] 99 times
10 [1 2 3] liberate [+] each generate
[[asked [number]] [reason "type mismatch"] [unwound [+ [[1 [take] [2 3] 10]] unwrap evert first swap drop [[generate [[+] bail] shielddown]] unwrap swap]] [actual [take]] [type error] [handled true]] 1 [take] [2 3] 10
[[program [[[program [+]]] environment advance]]] environment advance advance eval-step
[[program [[[program] lookup count] shield swap [[program] lookup count [[positive?] [<=]] [execute] every?] [eval-step] while swap drop]] [stack [[[stack []] [program [+]]]]]]
2.11.13.3. Resolution
After thinking about this some more, my conclusion:
This is about supporting multi-arity mapping functions, which did work in the original map implementation but are not supported in other languages. The way you access multiple values there is by closing over them. So the way you'd do it in kcats is like so:
10 [1 2 3] ;; the extra arg and the list [-] ;; the multi-arity map fn [clone] dipdown ;; clone the 10 [swap] unwrap prepend ;; prepend the word swap to the fn so that the 10 ends up beneath the list item float prepend ;; prepend the 10 map
[9 8 7] 10
In theory we could write a helper function called capture1 or something that does this for us, so you can write:
10 [1 2 3] [-] capture1 map
10 [1 2 3] ;; the extra arg and the list [-] ;; the multi-arity map fn [swapdown ;; f i [swap] unwrap prepend swap prepend] shielddown [liberate] dip each collect
[9 8 7] [generate [[10 swap -] bail] shielddown] [take] [] 10
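The closure idea, sketched in Python (capture1 here is the hypothetical helper named above, not an existing function):

```python
def capture1(value, fn):
    # bind `value` as the first argument, producing a unary function
    return lambda x: fn(value, x)

f = capture1(10, lambda a, b: a - b)
print([f(x) for x in [1, 2, 3]])  # [9, 8, 7]
```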
[1 2 "oh fudge"] [[5 +] [drop 5] recover] map
[6 7 5]
2.11.13.4. DONE Add functions to help capture environment for map/filter fns
It's too difficult to do this manually.
1 [2 3 4] [+] map
we want to redesign this so that we build the mapping function first:
1 [+] capture [2 3 4] swap map
[3 4 5] 1
and the generator equivalent
5 [* inc] capture [integers 100 dropper 10 taker] dip each collect
[501 506 511 516 521 526 531 536 541 546] [generate [[[5] swap [unwrap] dip * inc] bail] shielddown] [[positive?] [dec [generate] dive] [[]] if] 0 [[[positive?] [[generate drop] dip dec] while [generate swap] dip float] bail] 0 [inc clone] 109 5
2.11.14. DONE Reduce
0 [inc clone] 30 taker [+] [generate] dive clone ;; acc acc f ;;drop [generate] divedown [] [float execute clone] [] if ;; acc f g [[generate] divedown ;; i acc f g [] [float execute clone] [] if] ;; acc acc f g loop
0 [inc clone] 10 taker generate clone ;; acc acc ;;drop [generate] divedown [] [float execute clone] [] if ;; acc g [[generate] dive ;; i acc g [] [+ clone] [] if] ;; acc acc f g loop
55 [[positive?] [dec [generate] dive] [drop []] if] [inc clone] 10
0 [inc clone] 3 taker [*] ;; build the 'then' branch [clone] join ;; -> [+ clone] ;; build the loop body [[generate] dive []] swap put [[] if] join ;; generate the first item under the loop body [generate clone] dip loop
6 [[positive?] [dec [generate] dive] [[]] if] 0 [inc clone] 3
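The reduce being built here corresponds to folding a generator, e.g. in Python:

```python
from functools import reduce
from itertools import islice

def integers():
    # infinite sequence 1, 2, 3, ...
    n = 0
    while True:
        n += 1
        yield n

print(reduce(lambda acc, x: acc + x, islice(integers(), 10)))  # 55
print(reduce(lambda acc, x: acc * x, islice(integers(), 3)))   # 6
```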
1 2 3 4 [+] divedown
3 4 3
1 true [ inc clone 5 < ] loop
5
integers 1 dropper ;; start with 1 1000 taker ;; take items [3 *] each [odd?] keep [+ 37 mod] reduce
10 [clone [[generate] dip [drop generate] while] dive] [[[something?] [odd? not]] [execute] every?] [generate [[3 *] bail] shielddown] [[positive?] [dec [generate] dive] [drop []] if] [[[positive?] [[generate drop] dip dec] while [generate swap] dip float] bail] 0 [inc clone] 1000
1025 8 mod
1
Let's make an equivalent to map (that doesn't require a generator) for ease of use:
0 [1 2 3 4] [+]
…wait a minute, isn't that just step?
2.11.15. CANCELED Generator combinators?
- State "CANCELED" from "TODO"
Not sure there's anything to do here.
When writing partition, it would be nice if we could use generators within a generator. For example, we need to partition a list into pairs. It would be nice if we could use 2 taker repeatedly. Let's see if we can make that work:
[1 2 3 4 5 6 7] [take] [2 taker collect dropdown dropdown] collect
[[1 2] [3 4] [5 6] [7]] [2 taker collect dropdown dropdown] [take] []
Ok wow did not expect that to be so easy.
Maybe we can even implement the window shifting version?
[1 2 3 4 5 6 7] [take] 3 1 ; params: window-size, shift-size, state [] [[[dotake [[taker collect dropdown dropdown] ; drop the used-up taker generator join divedeep]] [doshift [[[count <=] [swap 0 slice] [[]] if] shield swap]]] [ [] [over wrap dotake [join doshift] bail] [[over] dive wrap dotake swap drop doshift] if] draft] collect
[[1 2 3] [2 3 4] [3 4 5] [4 5 6] [5 6 7]] [[[dotake [[taker collect dropdown dropdown] join divedeep]] [doshift [[[count <=] [swap 0 slice] [[]] if] shield swap]]] [[] [over wrap dotake [join doshift] bail] [[over] dive wrap dotake swap drop doshift] if] draft] [6 7] 1 3 [take] []
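The window-shifting behavior, sketched in Python (window size 3, shift 1; trailing partial windows are dropped, matching the leftover [6 7] state above):

```python
def windows(source, size, shift):
    # emit a window each time the buffer fills, then shift it
    buf = []
    for item in source:
        buf.append(item)
        if len(buf) == size:
            yield list(buf)
            buf = buf[shift:]

print(list(windows(iter([1, 2, 3, 4, 5, 6, 7]), 3, 1)))
# [[1, 2, 3], [2, 3, 4], [3, 4, 5], [4, 5, 6], [5, 6, 7]]
```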
2.11.16. DONE Applying generator to an existing container
- State "DONE" from "INPROGRESS"
- State "INPROGRESS" from "TODO"
We commonly have this construct: [[1 2 3] ... collect] shield, where we're transducing a list and we want to just get the result.
[1 2 3 4 5] [[odd?] keep] [xform dispenser] label [[[poke dispenser] [take] [splice xform] collect] shield] template execute
[1 3 5]
2.11.17. INPROGRESS Combinations
- State "INPROGRESS" from "TODO"
[1 2 3 4 5] [[count] shield -1 ;; l idx i [[[swap count =] dive] [drop drop take 0 swap] when [[wrap lookup] dive [pair] bail] shield [inc] dipdown] ;generate drop generate drop generate drop generate collect] shielddown
[[1 2] [1 3] [1 4] [1 5] [2 3] [2 4] [2 5] [3 4] [3 5] [4 5]]
This isn't quite right because we're not using a lower generator as the source, even though we could. We could start with an empty list as the state, call generate, and then yield pairs for that item and every item in the state. Then add it to the state and continue until the lower generator yields nothing.
"combinations" may not be a good name since it implies all combinations and not just pairs. Maybe put this on hold until we actually need it.
Ok let's do it this way, here's a python version:
def generate_recursive(items_generator, current_combo, remaining_arity):
    if remaining_arity == 0:
        yield current_combo
    else:
        for item in items_generator():
            yield from generate_recursive(items_generator,
                                          current_combo + [item],
                                          remaining_arity - 1)

def generate_combinations(items_generator, arity):
    for item in items_generator():
        yield from generate_recursive(items_generator, [item], arity - 1)

# Example usage with a generator:
def item_generator():
    for i in range(1, 4):
        yield i

arity = 3
result = []
for combo in generate_combinations(item_generator, arity):
    result.append(combo)
print(result)
| 1 | 1 | 1 |
| 1 | 1 | 2 |
| 1 | 1 | 3 |
| 1 | 2 | 1 |
| 1 | 2 | 2 |
| 1 | 2 | 3 |
| 1 | 3 | 1 |
| 1 | 3 | 2 |
| 1 | 3 | 3 |
| 2 | 1 | 1 |
| 2 | 1 | 2 |
| 2 | 1 | 3 |
| 2 | 2 | 1 |
| 2 | 2 | 2 |
| 2 | 2 | 3 |
| 2 | 3 | 1 |
| 2 | 3 | 2 |
| 2 | 3 | 3 |
| 3 | 1 | 1 |
| 3 | 1 | 2 |
| 3 | 1 | 3 |
| 3 | 2 | 1 |
| 3 | 2 | 2 |
| 3 | 2 | 3 |
| 3 | 3 | 1 |
| 3 | 3 | 2 |
| 3 | 3 | 3 |
[[combos [[] [[swap 1 =] [[generate] divedown clone wrap [put] dip] [[clone dec swap] dip swap [self enumerate generate] dipdown ] ;; put a generator for tuples one smaller and get the first one if]]] ] [[a b c d] [take] 2 combos generate] draft
2 [[swap 1 =] [[generate] divedown clone wrap [put] dip] [[clone dec swap] dip swap [combos enumerate generate] dipdown] if] [] [0 [a]] [[generate] dive [[pair] shielddown [inc] dip] bail] 1 [[swap 1 =] [[generate] divedown clone wrap [put] dip] [[clone dec swap] dip swap [combos enumerate generate] dipdown] if] [a] 1 [take] [b c d]
enumerate generator
[a b c d] [take] 0 [[generate] dive [[pair] shielddown [inc] dip] bail] collect
[[0 a] [1 b] [2 c] [3 d]] [[generate] dive [[pair] shielddown [inc] dip] bail] 4 [take] []
[a b c d e] 1 3 [] [[= not] dive] [ [[1 =] [[clone] dive] [] if] divedown] when
[a b c d e] [] 3 1 [a b c d e]
2.11.18. DONE Frequencies
- State "DONE" from "TODO"
Given a generator, keep track of how many times each value occurs.
[1 2 3 1 2 3 4 -1 3 3] [take] [] association [wrap [[] [inc] [1] if] update] cram
[[-1 1] [1 2] [2 2] [3 4] [4 1]] [take] []
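In Python terms, frequencies is a counting fold over the generator:

```python
def frequencies(source):
    # count occurrences of each value
    counts = {}
    for item in source:
        counts[item] = counts.get(item, 0) + 1
    return counts

print(sorted(frequencies(iter([1, 2, 3, 1, 2, 3, 4, -1, 3, 3])).items()))
# [(-1, 1), (1, 2), (2, 2), (3, 4), (4, 1)]
```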
[program [3 [0 >] [clone dec] while]] ;; the sample program to run tracer [[program] lookup first] each ;; what item is being executed [word?] keep ;; only words ;;frequencies [] association [wrap [[] [inc] [1] if] update] cram
[[actual [[dictionary dictionary_redacted] [program [program [3 [0 >] [clone dec] while]]] [stack [[[asked [program]] [handled []] [reason "word is not defined"] [type error]]]]]] [asked [program]] [handled yes] [reason "type mismatch"] [type error] [unwound [shielddown [[generate shielddown]] unwrap swap [[]] unwrap swap [[[generate] dive]] unwrap [[]] unwrap [[wrap [[] [inc] [1] if] update]] unwrap float join while drop]]] [[dictionary dictionary_redacted] [program [program [3 [0 >] [clone dec] while]]] [stack [[[asked [program]] [handled []] [reason "word is not defined"] [type error]]]]] [eval-step clone] [[dictionary dictionary_redacted] [program [program [3 [0 >] [clone dec] while]]] [stack [[[asked [program]] [handled []] [reason "word is not defined"] [type error]]]]]
[2 [0 >] [clone dec] while] ;; the sample program to run [tracer [[program] lookup [first] bail 0 or] each ;; what item is being ;; executed, don't emit [] ;; or the execution stops, ;; use 0 instead [word?] keep ;; count only words frequencies] shielddown
[[> 3] [clone 6] [dec 2] [decorate 2] [decorated 1] [dip 11] [dipdown 1] [evert 12] [execute 1] [first 3] [inject 3] [join 1] [loop 3] [put 3] [shield 3] [snapshot 3] [step 5] [swap 4] [take 3] [unwrap 14] [while 1] [wrap 3]]
[foo bar [] quux] [take] [word?] keep collect
[foo bar] [clone [[generate] dip [drop generate] while] dive] [[[something?] [word? not]] [execute] every?] [take] [quux]
2.12. TODO Make floats hashable
This will allow floats to be added to the KeyItem enum. Floats are not normally hashable, because mathematically identical numbers are not always represented the same way in memory and wouldn't hash the same. But for the purposes of kcats, I think this doesn't matter. We can document that you can't expect (0.1 + 0.2) and 0.3 to be the same map key.
This will then allow a list that contains floats, to be sorted, or be able to use float values as a sort-by key.
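The caveat is easy to demonstrate in Python, where floats are already hashable and the same surprise applies:

```python
# 0.1 + 0.2 rounds to 0.30000000000000004, a different bit pattern
# (and hence a different hash and map key) than the literal 0.3
d = {0.1 + 0.2: "sum"}
print(0.3 in d)       # False
print(0.1 + 0.2)      # 0.30000000000000004
```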
2.13. DONE Implement sorting stdlib
- State "DONE" from "INPROGRESS"
2.13.1. DONE Implement partialord
- State "DONE" from "TODO"
Each type needs to be comparable to another.
[["b" 2]["g" 5]["a" 1]["d" 4] ["c" 3]] association sort-indexed
[1 2 3 4 5]
[-2 10 -8 -12 8 0 1 20] [5 - abs] [clone] swap join [ pair] join map sort-indexed
Pair is (Int(-2), Int(7)) Pair is (Int(10), Int(5)) Pair is (Int(-8), Int(13)) Pair is (Int(-12), Int(17)) Pair is (Int(8), Int(3)) Pair is (Int(0), Int(5)) Pair is (Int(1), Int(4)) Pair is (Int(20), Int(15)) [8 1 10 0 -2 -8 20 -12]
UHOH
["hi" "there" "what" "is" "your" "birthdate" "homeboy"] [] [clone] swap join [pair] join map sort-indexed
Pair is (Iterable(Sized(String("hi"))), String("hi")) Pair is (Iterable(Sized(String("there"))), String("there")) Pair is (Iterable(Sized(String("what"))), String("what")) Pair is (Iterable(Sized(String("is"))), String("is")) Pair is (Iterable(Sized(String("your"))), String("your")) Pair is (Iterable(Sized(String("birthdate"))), String("birthdate")) Pair is (Iterable(Sized(String("homeboy"))), String("homeboy")) ["birthdate" "hi" "homeboy" "is" "there" "what" "your"]
8 5 -
3
1 2 [inc] both
3 2
2.13.2. DONE Implement compare
- State "DONE" from "TODO"
Should expose Rust's comparison function. That will allow a native sort function, for max flexibility (but not performance).
"a" "b" compare
less
"a" "a" compare
equal
["a" "b"] ["a" "c"] compare
less
"foo" encode [1] compare
less
This should work - the empty set and map maybe can't be compared but Nothing should be in there.
[] -1000 compare
greater
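One way to get a total order across types is to rank the type first and only compare values within the same rank. This Python sketch uses an invented ranking for illustration (not kcats' actual ordering):

```python
def type_rank(v):
    # invented ranking: numbers < strings < lists
    return {int: 0, float: 0, str: 1, list: 2}[type(v)]

def compare(a, b):
    ra, rb = type_rank(a), type_rank(b)
    if ra != rb:
        return "less" if ra < rb else "greater"
    if isinstance(a, list):
        # lexicographic: element by element, then by length
        for x, y in zip(a, b):
            c = compare(x, y)
            if c != "equal":
                return c
        a, b = len(a), len(b)
    if a < b:
        return "less"
    if a > b:
        return "greater"
    return "equal"

print(compare("a", "b"))                # less
print(compare(["a", "b"], ["a", "c"]))  # less
print(compare([], -1000))               # greater
```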
2.14. TODO Stream transformation
Problem: kcats doesn't speak http or https or various other protocols and formats, but rust does. We want to be able to use the complicated bits of rust, but let kcats decide how to combine them.
Implementation: I think we may need to create a new Rust enum Item type, that acts as a generator. It has an input method "nextinput" that takes an input chunk which is the result of the generator beneath, then it either returns None (updated the state, but no new item yet), or Item (got a new item), or some signal for end of stream, or Error. So it would have some program with a while loop to iterate. All such transforms would probably have the same program.
I think what I am getting at here is that Items should implement Rust traits where possible, eg Read/Write for file and network pipes.
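A minimal sketch of that transform shape in Python (the method name next_input and the line-splitting example are illustrative; the real thing would be a Rust enum): feed chunks in, get zero or more complete items out, with None signalling end of stream:

```python
class LineDecoder:
    """Incremental transform: string chunks in, complete lines out."""

    def __init__(self):
        self.buf = ""

    def next_input(self, chunk):
        # chunk=None signals end of the upstream; flush what's left
        if chunk is None:
            leftover, self.buf = self.buf, ""
            return [leftover] if leftover else []
        self.buf += chunk
        out = []
        while "\n" in self.buf:
            line, self.buf = self.buf.split("\n", 1)
            out.append(line)
        return out

d = LineDecoder()
print(d.next_input("he"))       # [] - no complete item yet
print(d.next_input("llo\nwo"))  # ['hello']
print(d.next_input(None))       # ['wo']
```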
2.15. INPROGRESS Select from multiple pipes
A basic select (which I call attend
) is in place.
2.15.1. TODO Attend should leave the pipe list argument
A lot of callers would want to re-use that argument so it shouldn't need to be shielded by default.
2.15.2. TODO Better error handling
There's lots of places where flume could throw an error and we don't do anything about it.
2.16. TODO Monitoring tools
2.16.1. TODO Reporting back to the mothership
When we spawn/animate, the environment is in its own universe and the main environment has no way to get any information about it, except by whatever means are baked into the spawned env's program. Users can come up with their own scheme of sending some kind of result via a pipe, of course. But what happens if the program encounters an error?
It would be nice to wrap the program such that it reports the final stack via a pipe, back to the main environment. And in the main env, it would be nice to keep a list of those pipes so we can select and get updates. Note, need to compare and contrast with the existing mechanism in 'future'.
Another nice tool would be the ability to send the current state back on demand (sort of like a thread dump) - in the spawned env, call eval-step until some signal comes in on the pipe from the main env, then send back a copy of the env. This mechanism could be used later to implement a monitoring tool.
How to do this: I think a combination of "channel of channels", and redefinition of spawn with let, should go a long way. The channel-channel lets new nested envs send back reply channels to the master env, even if they are deeply nested. Redefining spawn lets us insert the code to send those channels back (by passing in the channel that leads back to the master env). What would be really handy is parsing the inner env data to see which references to channels it contains, seeing whether it's a sender or receiver, and drawing arrows between envs so users can see they talk to each other.
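The "channel of channels" shape, sketched with Python threads and queues (spawn here is a stand-in for illustration, not kcats' spawn):

```python
import queue
import threading

master = queue.Queue()  # the channel-of-channels back to the master env

def spawn(program):
    # wrap the program so its outcome (result or error) is reported on a
    # reply channel, and register that channel with the master env
    reply = queue.Queue()
    master.put(reply)

    def run():
        try:
            reply.put(("done", program()))
        except Exception as e:
            reply.put(("error", repr(e)))

    threading.Thread(target=run).start()

spawn(lambda: 2 + 2)
spawn(lambda: 1 / 0)
for _ in range(2):
    print(master.get().get())  # ('done', 4) then the error report
```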
2.16.2. TODO Monitoring UI
We could show not only all the envs and their recent state (perhaps dumped every few seconds), but also arrows between environments that represent pipes: if two envs have a copy of the same pipe anywhere in the stack or program, draw an arrow, and if one env has a sender and the other a receiver, show an arrow indicating the direction of data flow along with the pipe id.
We could also allow views into a particular pipe where we copy the last handful of values to pass through (this is doable for channels but probably not file/network pipes).
2.17. INPROGRESS Native REPL
2.17.1. DONE Main mode of reading program from cmdline or file
- State "DONE" from "TODO"
2.17.2. INPROGRESS REPL as a kcats program
- State "INPROGRESS" from "TODO"
Read inputs from stdin, eval in a nested env, write to stdout.
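A Python skeleton of that loop (the completeness check here is a toy balanced-brackets stand-in for kcats' real reader):

```python
def complete(text):
    # toy completeness check: balanced square brackets
    return text.count("[") == text.count("]")

def repl(lines, evaluate):
    # accumulate input until it reads as complete, then evaluate it
    buf = ""
    for line in lines:
        buf += line
        if complete(buf):
            yield evaluate(buf)
            buf = ""

print(list(repl(["1 [2", " 3] join"], str.strip)))  # ['1 [2 3] join']
```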
standard [take] [string] each "" [[[complete? [swap count <=]] [readcount [[take] "\n" split generate [[drop] 4 times] dip [read first] bail 0 or]]] [[[generate] dive [] [join] [drop] if readcount] [complete? not] [[generate] divedown swapdown join swap] prime] draft cut] [read] each collect
[] [generate [[read] bail] shielddown] [[[complete? [swap count <=]] [readcount [[take] "\n" split generate [[drop] 4 times] dip [read first] bail 0 or]]] [[[generate] dive [] [join] [drop] if readcount] [complete? not] [[generate] divedown swapdown join swap] prime] draft cut] "" [generate [[string] bail] shielddown] [take] [[peer standard] [type tunnel]]
2.18. CANCELED Words that quote programs instead of executing them
- State "CANCELED" from "TODO"
I don't think there's really any good fix for this. For example, 'partition' needs to insert some state before the generator. The generator doesn't actually include that state, but it needs to be there. So the quoted program is needed.
E.g. liberate: it is just [take], so it doesn't actually do anything by itself. It seems like the quotedness should remain and maybe the word should always perform the action. In that case we would have to write 5 [taker]. I'm not thrilled with that either, but maybe it's just not a good name.
It does seem like there's an inconsistency having a word quote a program instead of the caller doing it.
There are certainly words that operate on programs without executing them (like each, which just modifies the mapping function to call the generator below it; it doesn't add an entirely new generator to the stack), but the word is still executing a program vs just self-inserting one object.
So I think I do have to fix this. I'm just not sure what to do.
I think it will look inconsistent to write:
integers 5 [taker] [inc] each
People will see that and wonder why taker is quoted but not each. It makes sense when you dive into it.
Isn't this just an implementation detail? In theory each could be a separate generator, and honestly it probably should be.
2.18.1. TODO Get rid of self-inserting programs (esp with generators)
Problem: words like joiner and taker don't do anything except insert a program. That, I think, should be an anti-pattern in kcats. If you want to put a program on the stack succinctly, then define a word that does what you want, and quote that word.
For example, take some complex function foo: if you want a program on the stack that does what foo does, use [foo]. foo itself should perform the action.
There's some confusion because each does perform an action: it modifies the program already on the stack. So you would write 5 [taker] but [inc] each.
One issue is with something like partition, where there's boilerplate initial state that needs to go on the stack before the generator program. With taker, the user provides the initial state because we don't know what it is in advance.
[a b c d e] [take] 2 2 partition generate
[[asked [taker]] [handled yes] [reason "word is not defined"] [type error] [unwound [taker collect dropdown dropdown [2] unwrap [2] unwrap float [[]] unwrap swap swap drop shift]]] [[[dictionary dictionary_redacted] [program [[] [over wrap take-chunk [join shift] bail] [[over] dive wrap take-chunk swap drop shift] if]] [resolver [#b64 "yO3LwN0ITlhqAj8T1IKcUqNoiQmEAyrBwbFpGixDtQ8="]]] [stack] [snapshot] divedown assign environment evaluate [stack] lookup restore] 2 [take] [a b c d e]
It's possible for partition to check if the state is present and create it (since the state is always a list and otherwise it would see a number). But it's not generally possible for a generator to tell if it needs to add state - it should already be there. So if we're just quoting the generator, then what will add it?
Self-insert: can insert
2.19. TODO Data compression
Data streams that we intend to produce later are going to need compression - the streams should be as small as possible (they'll be encrypted later so it's too late to compress them after that). lz4 maybe?
2.20. TODO Multimethod improvements
2.20.1. TODO Convert to multi
2.20.2. DONE Refactor addmethod
- State "DONE" from "INPROGRESS"
- State "INPROGRESS" from "TODO"
[[[[count 3 >] ["foo" put]] [[not] ["bar" put]]] decide] [count 1 =] [rest] pair ;; [c b] [[[...]] decide] wrap [prepend] join [[0]] dip update
[[[[count 1 =] [rest]] [[count 3 >] ["foo" put]] [[not] ["bar" put]]] decide]
[[hash definition] [[type [foo] unwrap =] [drop "foo" hash] addmethod] update] [ [[foo bar]] association hash] lingo
#b64 "LCa0a2j/xo/5m0U8HTBBNBNCLXBkg7+g+YpeiGJm564="
[[foo bar]] association type
foo
2.20.3. DONE ismulti?
- State "DONE" from "TODO"
2.21. CANCELED run multiple programs on same argument to get list
- State "CANCELED" from "TODO"
I think this is clear enough, no new word needed
5 2 [[+] [*]] [execute] map
[7 10] 2 5
2.22. INPROGRESS pairwise operations
- State "INPROGRESS" from "TODO"
1 2 3 4 5 [] both] [[] evert [2 2 partition] assemble] dip inject [joiner] assemble unwrap [] swap evert drop
9 5 1
this generator based impl doesn't support nil values on the stack:
1 2 [] 3 4 [swap] pairwise
[[type error] [reason "not enough items on stack"] [unwound [swap [[]] unwrap evert [joiner] assemble unwrap [] swap evert drop]] [asked [consume]] [handled yes]] [4 3]
2.23. DONE Non-generator filter
- State "DONE" from "INPROGRESS"
- State "INPROGRESS" from "TODO"
map now doesn't require you to bind values (the map function has access to the rest of the stack).
Do the same for filter.
5 [1 2 3 4] [+ odd?] [] sink ;; put empty results below list [shield dip] decorate ;; run map fn shielded and dipped under result [swap] unwrap prepend ;; start by swapping the result back to the top [ swap [] [drop swap put] [drop dropdown] if] join ;; end by checking pred, add to result step
[2 4] 5
2.24. INPROGRESS Modules
- State "INPROGRESS" from "TODO"
2.24.1. Problem statement
2.24.1.1. TODO Efficient use
When code is defining new vocabulary and it gets called in a tight loop, it should not be modifying the dictionary each time. The dictionary should be append-only (to avoid having to swap back and forth between a modified and unmodified version).
;; save a modified env or dictionary and use it ;; again. [crypto] stdmod ;; make the module ;; apply it to the dictionary ["foo" encode hash] confine
#b64 "LCa0a2j/xo/5m0U8HTBBNBNCLXBkg7+g+YpeiGJm564="
Let's make a word to load multiple modules
[crypto time] ;; apply the changes in order dictionary swap [decache string read shielddown] step ;; now execute
dictionary_redacted
Let's divide up the functionality:
- reading from cache and parsing (module)
- apply module to dictionary (inscribe)
- spawn env with new dict and program (spawn)
- capture result as our stack
2.24.1.2. TODO Modification happens once per program
Dictionary modification should not happen at "runtime" (when the program is actually being executed) - it should be modified when the program is built. However the consequence of this is pretty dire, because much of the standard library is currently just programs that are literals and don't need "building".
2.24.1.3. TODO Nested library calls need to work
I should be able to call af, which loads (or depends on) b and calls bf, which loads c and calls cf, and those loads should only happen once even if I call af in a tight loop.
2.24.1.4. TODO Code should be shareable
That means local names should generally not appear in code, as they change meaning.
2.24.1.5. TODO Building vocabulary and the programs that use that vocabulary need to be separable
(that's the whole point of a library). In practice there will be some "mini-libs" that mostly just make code easier to read and stay with the program that uses them. However we need to be able to modify the dictionary, and then refer to that modification later (by name? hash?), to support the typical library use case. That maps to a use/require when a program starts, and then later, in some arbitrary place in the code, you refer to the library's functions via its namespace.
2.24.1.6. TODO Sandboxing
- TODO Preservation of meaning
If we execute untrusted code and then our own code, our code should mean the same thing as it would have before executing the untrusted code.
- TODO Access control
We should be able to execute untrusted code in a limited environment (where, for example, it does not have access to the filesystem etc).
- TODO Fine grained access control
We could, for example, limit filesystem access to a particular directory, or network access to a particular host. One way this could be done: have all such primitives run some other word as a predicate lock, and respect the outcome of that predicate. However, some care would need to be taken that the untrusted code couldn't just bootstrap a new env without that lock in place. I'm not sure this is possible within the overall language design.
Integrating an authentication scripting language directly into the core of your stack-based language and leveraging it for controlling access to sensitive operations could indeed be a powerful and flexible solution. This approach aligns well with your goal of making authentication programmable while addressing the specific challenges of providing fine-grained access control in a sandboxed environment. Here's how to address potential concerns and make the most out of this integration:
Design Principles
Efficiency by Design: Since performance is a concern, designing the authentication script execution to be as lightweight as possible is crucial. Optimize the most common authentication paths to reduce overhead. Consider caching results of authentication checks where safe and applicable, especially in scenarios where the same authentication decision is repeated.
Conditional Authentication Checks: Implement the authentication scripts to run conditionally, i.e., only in contexts where sandboxing is required. This minimizes the performance impact on the overall system while still providing robust security measures where they're most needed.
Customizable Script Complexity: Allow the complexity of the authentication scripts to be tailored according to the security needs of the sandbox environment. For less sensitive operations or more trusted sandboxed environments, simpler scripts could be used, reducing resource consumption.
Practical Implementation
Authentication Context: Provide a rich context to the authentication scripts, including details about the requested operation (e.g., file path for file access, URL for network requests), the environment's security status, and any relevant user or process identifiers. This enables writing precise and effective authentication logic.
Digital Signatures and Proof Checking: As part of the authentication scripts, leverage digital signatures for verifying the integrity and authenticity of the scripts themselves or any other supplied credentials. Although checking digital signatures can be resource-intensive, optimizing the cryptographic operations and selectively applying them can help manage the performance impact.
Expandable Security Model: By integrating authentication scripting into the core, you lay a foundation that's not only useful for sandboxed environment control but can also be expanded for broader security features in the future, such as secure inter-process communication or encrypted data storage, using the same flexible scripting approach.
Security and Performance Balance
Asynchronous Operations: When possible, make the authentication checks asynchronous, especially for I/O bound tasks like network requests or disk access. This can help mitigate the performance impact on the rest of the system.
Resource Limits and Quotas: Introduce limits on the resources that authentication scripts can consume, such as CPU time or memory usage. This prevents potential abuse where overly complex scripts could degrade system performance.
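The quota idea above can be pictured as a hard step budget enforced by the evaluator loop. This is a conceptual sketch in Python, not kcats; the names (run_with_quota, QuotaExceeded) are hypothetical, and the "program" is a list of callables standing in for interpreter eval-steps.

```python
class QuotaExceeded(Exception):
    """Raised when a script exceeds its step budget."""

def run_with_quota(program, max_steps):
    # `program` is a list of zero-argument callables standing in for
    # interpreter eval-steps; a real interpreter would decrement the
    # budget once per eval-step instead.
    steps = 0
    for op in program:
        if steps >= max_steps:
            raise QuotaExceeded(f"exceeded {max_steps} steps")
        op()
        steps += 1
    return steps

# A cheap authentication check finishes within its budget:
assert run_with_quota([lambda: None] * 10, max_steps=100) == 10
```

A memory quota would work the same way, checked at allocation points rather than per step.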
In sum, your approach of incorporating an authentication scripting layer directly into the core of the language introduces a novel mechanism for fine-grained access control, effectively turning authentication checks into programmable gatekeepers for sensitive operations. While it introduces challenges in terms of complexity and resource management, careful design and optimizations can mitigate these issues, offering a powerful and flexible tool for secure sandboxing.

I guess the question is, can the lock be unalterable? I can definitely hardcode some aspects of these locks so that they are alterable from the outside but not the inside. For example, if my trusted env wants to create an untrusted one, it could insert a lock into a field of the inner env that is not accessible to the untrusted program. However, the inner env can still construct a new env itself with a very permissive lock and use it to read the filesystem however it wants. So somehow, the constructor needs to force new envs to inherit the locks of the current one. That's possible to implement. It could be a series of locks (append only), and the key has to unlock all of them.
> I think the locks don't really even need cryptography in that case: the lock would say "can only read/write to /tmp" and the key would just be the file that the environment is trying to access. If it starts with /tmp then it is allowed.
Yes, designing a lock mechanism that is both unalterable from inside the untrusted environment and inheritable for any subsequently created environments is indeed possible and can effectively secure the system according to your requirements. The append-only series of locks, where each new environment inherits the restrictions (locks) of its creator, ensures that no matter how deeply nested an environment is, it cannot escape the boundaries set by its original, trusted creator. Here's how such a mechanism could be structured:

Lock Mechanism Design
Immutable Locks: Implement locks as immutable structures within each environment. Once a lock is set, it cannot be modified from within that environment. New environments inherit the lock state of their parent environment upon creation.
Lock Inheritance and Aggregation: Ensure that when an environment creates a new environment (even when an untrusted one creates another environment), the new environment inherits all the locks of its parent environment, potentially adding its own restrictions but never relaxing any existing ones. This could be done in an append-only manner, where new locks add to the restrictions but do not overwrite them.
Path-Based Access Control: Your description of using locks to control file access based on paths (e.g., "can only read/write to /tmp") is an example of path-based access control. This control is straightforward yet effective and does not necessarily require cryptographic keys for enforcement, simplifying implementation and reducing overhead.
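A minimal sketch of this mechanism in Python (not kcats; Env and make_lock are hypothetical names): locks are predicates, children append to an inherited, immutable lock chain, and an access is allowed only if every lock in the chain permits it.

```python
import os.path

def make_lock(allowed_prefix):
    # A lock is just a predicate over the requested resource. A real
    # implementation should compare path components, since a bare
    # prefix check would also admit e.g. "/tmpfoo".
    def check(path):
        # normpath blocks trivial escapes like /tmp/../etc/passwd
        return os.path.normpath(path).startswith(allowed_prefix)
    return check

class Env:
    """An environment carries an append-only chain of locks: children
    inherit every parent lock and can only add more, never relax."""
    def __init__(self, locks=(), parent=None):
        inherited = parent.locks if parent is not None else ()
        self.locks = tuple(inherited) + tuple(locks)

    def allowed(self, path):
        # The "key" (the path being accessed) must open every lock.
        return all(lock(path) for lock in self.locks)

trusted = Env()                                   # unrestricted
sandbox = Env([make_lock("/tmp")], parent=trusted)
inner = Env(parent=sandbox)                       # inherits the /tmp lock

assert sandbox.allowed("/tmp/scratch.txt")
assert not sandbox.allowed("/etc/passwd")
assert not inner.allowed("/etc/passwd")           # cannot escape
assert not sandbox.allowed("/tmp/../etc/passwd")  # normalized away
```

Note that no cryptography is involved: the "key" is just the path being requested, exactly as described above.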
Implementation Considerations
System-Enforced Lock Checks: The core system that mediates access to resources (file system, network, etc.) must enforce the lock checks before granting any operation. This ensures that no environment, regardless of its programming, can bypass the restrictions placed upon it by its defined locks.
Auditing and Monitoring: Although the lock mechanism provides a strong control mechanism, implementing auditing and monitoring on the enforcement points can help identify potential attempts to bypass restrictions or exploit weaknesses in the lock design.
Error Handling and Feedback: Define clear behavior and feedback mechanisms for when an environment attempts to perform an operation that is denied by its locks. Good error handling is crucial for developers working within these environments to understand the boundaries and work effectively within them.
Documentation and Transparency: Providing clear documentation on how locks work, how they are inherited, and how they restrict operations will aid in the adoption and correct usage of the environment creation and management functionalities you are designing.
In conclusion, your approach to using immutable, inheritable locks that enforce path-based access control offers a robust method to secure environments in a stack-based programming language, without necessitating the complexity and overhead associated with cryptographic solutions. This method provides a fine-grained control mechanism that's both practical and efficient, aligning well with the needs of a secure, sandboxed execution environment.
- TODO Fine-grained access control
2.24.2. Discussion
2.24.2.1. pairs of program/dictionary.
It's sort of like a passage of English prose, and a dictionary of what all the words used in that prose mean. In terms of "true" meaning, those two things are bound up together; you can change the meaning either by changing the prose or the dictionary. So in a sense the "meaning hash" is the hash of the dict+program, not just the program itself.
One potential hole in this concept is the existence of axiom words (that aren't defined in terms of other words). It's not clear how to hash them, we'd need the source of their implementation. Another issue is it's possible to make alterations in the interpreter that still execute the program the same way but the hash doesn't match. So non-matching hash doesn't mean "different meaning", but same hash generally means "same meaning". That's the way hashes work most places though.
Can we do anything with this? Maybe not.
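As a sketch of the dict+program hashing idea above (Python, not kcats; meaning_hash is a hypothetical name): hash a canonical serialization of the program together with every definition it transitively uses. Axiom words simply have no entry here, which is exactly the hole noted above.

```python
import hashlib

def meaning_hash(program, dictionary):
    # Collect the definitions of every word the program transitively
    # uses; axiom words have no entry, so their implementation source
    # is not covered by the hash.
    seen, todo = {}, list(program)
    while todo:
        word = todo.pop()
        if word in dictionary and word not in seen:
            seen[word] = dictionary[word]
            todo.extend(dictionary[word])
    canonical = repr((list(program), sorted(seen.items())))
    return hashlib.sha256(canonical.encode()).hexdigest()

dictionary = {"square": ["clone", "*"], "cube": ["clone", "square", "*"]}
h1 = meaning_hash(["cube"], dictionary)
# Changing a transitively-used definition changes the hash...
h2 = meaning_hash(["cube"], {**dictionary, "square": ["clone", "+"]})
assert h1 != h2
# ...but adding words the program never uses does not.
h3 = meaning_hash(["cube"], {**dictionary, "unused": ["drop"]})
assert h1 == h3
```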
2.24.2.2. not having to recalculate the whole dictionary each time we want to use a module.
There's several possible mitigations:
- Leave the modified dictionary on the stack so it can be reused
- Make 2 levels of env nesting per library
- One where the modified dict is in the dict (but words are not available)
- One where we actually apply that dict so the words can be used
It's possible to load all the libraries first, in a single environment and then go from there. However it's inherently nested - any library might have dependencies. So who loads them? We can't load them right at execution time because that would happen repeatedly.
Somehow we need the module that updates the dictionary to pull in its own dependencies. But how can it do that, when its dependencies aren't loaded and don't have names?
eg, crypto -> hash -> bytes. If crypto depends on hash in some other module where do we fetch it from?
Can we include the loading of dependencies in the module itself? In other words, include the changes from the other module in this one? I think that might be possible. However if the intention is to refer to the dependency by name like in other languages that might be more difficult. Perhaps we can start by breaking out stdlib modules and having the standard env refer to them as dictionary entries.
However how would dependency loading work then? Let's say we have
[[foomod [[[foo ["foo"]]] draft join]] [barmod [foomod join [[bar ["foo"]]] join]]]
That could work: if both foomod and barmod are in the dictionary, then we can have barmod refer to foomod.
This doesn't quite solve the problem that let should solve, though. We really just want a little supplemental dictionary for the duration of a program, but we want it "compiled in" so that the program can be passed around and run many times without having to do the setup work again.
So what we want is to pass around the program+dictionary. But doing it as an "environment" is maybe not quite what we want, because the stack is not permanent. So perhaps what we need is a word that takes an environment and executes it as if it's just a program (inheriting the stack). Then the actual program is the call to that word.
;; the stack [5 6 7 8 9] ;; let's create an inner env, all the compile time stuff [[square [clone *]]] draft inscribe [square] spawn wrap ;; now let's call this with inherited stack at runtime [[snapshot] dive [stack] swap assign ;; all set evaluate [stack] lookup restore] join ;map
[[[dictionary dictionary_redacted] [program [square]] [stack [[5 6 7 8 9]]]] [snapshot] dive [stack] swap assign evaluate [stack] lookup restore] [5 6 7 8 9]
What we need then is a way to access the "compiled" program repeatedly, with different stack data each time.
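One way to picture that requirement: "compile" once into a (dictionary, program) pair, then run the pair against a different inherited stack each time. A toy Python sketch, not kcats (make_closure is a hypothetical name, and the mini-interpreter only knows clone and *):

```python
def make_closure(dictionary, program):
    """'Compile' once: pair the program with its dictionary so it can
    be run many times against different stacks without redoing setup."""
    def run(stack):
        stack = list(stack)          # inherit the caller's stack
        todo = list(program)
        while todo:
            word = todo.pop(0)
            if word in dictionary:
                todo = dictionary[word] + todo   # expand the definition
            elif word == "clone":
                stack.append(stack[-1])
            elif word == "*":
                b, a = stack.pop(), stack.pop()
                stack.append(a * b)
            else:
                stack.append(word)               # literals push themselves
        return stack
    return run

square = make_closure({"square": ["clone", "*"]}, ["square"])
assert square([5]) == [25]
assert square([9]) == [81]    # reused with different stack data
```

The dictionary setup happens exactly once, at make_closure time; each call to the closure only pays for evaluation.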
Also, this binds a dictionary to a single program, which is fine for let, but we also need a module case where we make the dictionary available via a word.
What about cases where we want to load more than one module at a time? Can we save that dictionary too? Do we need to? I don't know why we'd need to call using in a tight loop; using could just go around the loop.
Let's just try some stuff. Create a module and cache it:
[[square [clone *]] [cube [clone square *]]] draft encode [] cache
#b64 "Tz9VhU5ISws4N7I7ckTcKBEpjCGHp5Svc1O7t7JRWX4="
ok now we want to use this module
#b64 "Tz9VhU5ISws4N7I7ckTcKBEpjCGHp5Svc1O7t7JRWX4=" decache string read first inscribe [9 cube]
[9 cube] dictionary_redacted
2.24.2.3. Separate manifest
If we look at other languages, usually there's a separate piece of data from the actual program: the build manifest. It specifies what versions of what libraries are to be loaded. Then later in the code, the libraries are brought in but there's no mention of versions or where the package came from.
kcats could have a similar mechanism, but since there's no separate build tool (yet, eventually will need something to at least fetch remote libraries), we can have a sort of prelude section to a program where we specify all the hashes of libs we want to use in the program. By that point, presumably they are already in the cache.
This brings up the question of how dependencies are specified. The naive approach would be for the code of the module itself to have its own prelude (which is pretty much how other languages work). However that leaves the issue of how the fetch tool will know what it needs to fetch. An obvious method is via convention, that the first thing in the module is the dependencies, which we can read the hashes and go download them. Another is to just load the library in the tool, so then the convention is not needed (other things can come before the hashes). The tool would somehow interpret "loading from cache" as "load from cache and go to the network if it's not there".
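The "load from cache, and go to the network if it's not there" behavior could look like this (a Python sketch; fetch and download are hypothetical names, and downloaded content is verified against its hash before entering the cache, so the mirror doesn't need to be trusted):

```python
import hashlib

def fetch(cache, content_hash, download):
    # Load from cache; fall back to the network if the hash is missing.
    # `download` is a hypothetical fetch function (e.g. a package
    # mirror lookup); content is verified before it enters the cache.
    if content_hash not in cache:
        blob = download(content_hash)
        if hashlib.sha256(blob).hexdigest() != content_hash:
            raise ValueError("downloaded content does not match hash")
        cache[content_hash] = blob
    return cache[content_hash]

blob = b"[[square [clone *]]] draft"
h = hashlib.sha256(blob).hexdigest()
cache = {}
assert fetch(cache, h, lambda _h: blob) == blob   # network path
assert fetch(cache, h, None) == blob              # cache hit; no download
```

Because content is addressed by hash, "already in the cache" and "fetched just now" are indistinguishable to the program.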
2.24.2.4. Use of names
It occurred to me that we don't have to use names at all when it comes to libraries or modules. We just let code refer to hashes, and the IDE tools will help resolve those hashes to names or vice versa. But the canonical form (in the actual code) will be hashes only.
This solves a lot of issues:
- There is no chicken and egg problem. The kcats language itself simply does not support names or local address books. It refers to content by hash and that's it. That's how libraries are loaded, etc. After the kcats language is complete, then we write naming tooling in kcats, and we run those tools against a kcats program to resolve hashes to names (possibly as part of IDE functionality). For example, the IDE replaces hashes with names (if known) and you can hover over the name to see the hash if needed.
- Code is sharable because there are no local names present
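The IDE-side tooling might keep a local address book and translate in both directions, leaving hashes as the only canonical form. A Python sketch with made-up names and a placeholder hash string:

```python
# A hypothetical IDE-side address book. The canonical code only ever
# contains hashes; names exist purely at the presentation layer.
address_book = {"math": "aGFzaDE="}   # made-up placeholder hash

def to_display(source, book):
    # Replace known hashes with names, for reading only.
    for name, h in book.items():
        source = source.replace(h, name)
    return source

def to_canonical(source, book):
    # Replace names with hashes before the code is run or shared.
    for name, h in book.items():
        source = source.replace(name, h)
    return source

canonical = '"aGFzaDE=" decache'
assert to_display(canonical, address_book) == '"math" decache'
assert to_canonical('"math" decache', address_book) == canonical
```

Since the address book never leaves the local machine, shared code carries no local names and the chicken-and-egg problem disappears.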
;; define a module [[square [clone *]]] draft inscribe ;; now the dictionary is on the stack [9 square] confine
81
Defining a module hierarchy using only the stack can get difficult. We can create dictionaries and make them dictionary entries in order to refer to them.
[[square [clone *]]] draft inscribe ;; so now we have an updated dictionary, let's make this an entry we ;; can access without polluting the dictionary with all the words we defined wrap [math] swap put entry dictionary swap put [math [8 square] confine] confine
64
So how would we make a hierarchy with this mechanism? For example, the db has a hierarchy. Maybe make some words to help. Let's say we have a that depends on b and c. We can create those dictionaries separately and then make the dependencies available as words. So the environment that runs a has entries for the b and c dictionaries. So when it needs to call b/c words, it calls them with confine. Does it make sense to keep calling confine, or should we just merge all the necessary words at the beginning of the program? How do we avoid either calling confine just for one word, or just using one giant dictionary? I don't know. Neither makes any sense; this whole design is crap. It's basically just horrible ad-hoc namespaces, where the inner dictionary is its own namespace.
;; using: takes a list of words that point to dictionaries and merges them
2.24.2.5. How to implement let
It's something most languages just don't have: functions scoped to a function. I mean, they do, but there is no way to avoid having them redefined each time.
Can we do better?
So I guess what we want is to take a program that contains local functions and an outer program, and transform that into a program that contains an altered dictionary.
Maybe let just transforms the program without executing it?
[[plus2 [2 +]]] draft dictionary swap [emit encode hashbytes] shield [shield] dip sink [dictmerge] shielddeep
dictionary_redacted #b64 "g5nJOWpyglIeN2EgJOdFRVQ0ix76q42bTs2uG5w5J/s="
This gives us basically a closure
3 4 [[plus2 [2 +]] [stuff [plus2 3 *]]] [stuff] [draft dictionary swap [emit encode hashbytes] shield [shield] dip sink [dictmerge] shielddeep] dip ;; under the let program ;; prog dict hash [wrap] dipdown ;; wrap the hash to make a list of 1 namespace [program dictionary resolver] label environment ;; creates closure ;; now execute the closure by capturing outer stack [stack] [snapshot] divedown assign evaluate [stack] lookup restore
18 3
So I guess the challenge then is to be able to reuse it (eg inside map), which we can this way:
1 10 1 range ;; the inner functions [[plus2 [2 +]] [stuff [plus2 3 *]]] ;; program to run that uses them [stuff] [draft dictionary swap [emit encode hashbytes] shield [shield] dip sink [dictmerge] shielddeep] dip ;; under the let program ;; prog dict hash [wrap] dipdown ;; wrap the hash to make a list of 1 namespace [program dictionary resolver] label environment ;; creates closure ;; when the closure executes, capture the outer stack first wrap [[stack] [snapshot] divedown assign evaluate [stack] lookup restore] join map
[9 12 15 18 21 24 27 30 33]
2.24.2.6. Sandboxing
In the context of how the rest of kcats works, I think sandboxing should work as follows:
As always, when altering the dictionary, we can only do it in a new environment. Since environments and dictionaries are first class objects, it will always be possible to construct a dictionary with any word whose value is reachable from the current environment. Therefore in order to make a secure sandbox, we have to remove the values that we don't want used so that they are truly unreachable. That means overwriting words or deleting them from the dictionary and then creating a new environment from that reduced dictionary.
As mentioned before, this is mostly about axiom words (that do things like access network etc). There isn't much point in obscuring derived words, since it's always possible to reconstruct it from axiom words.
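A sketch of that reduction step (Python, not kcats; sandbox_dictionary is a hypothetical name): delete the denied axiom words and build the new environment's dictionary from what remains, so the removed values are genuinely unreachable.

```python
def sandbox_dictionary(dictionary, denied):
    # Remove denied axiom words outright so their values are truly
    # unreachable from the new environment. Derived words aren't worth
    # hiding: they can be rebuilt from whatever axioms remain.
    return {word: entry for word, entry in dictionary.items()
            if word not in denied}

full = {"put": "<io axiom>", "take": "<io axiom>",
        "+": "<axiom>", "plus2": ["2", "+"]}
reduced = sandbox_dictionary(full, denied={"put", "take"})

assert "put" not in reduced and "take" not in reduced
assert "+" in reduced and "plus2" in reduced   # pure words survive
```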
2.24.3. Implementation
2.24.3.1. TODO take a dictionary and a program and execute the program with that dict
2.24.3.2. TODO take a mapping of name to module and return a dictionary with those modules
[[square [clone *]] [cube [clone square *]]] draft [[foo ["foo"]]] draft [foo math] label draft inscribe [[foo math] [9 square foo] using] confine
"foo" 81
2.24.3.3. TODO One module depends on another, loads it
So, how exactly should this work?
How does one module ensure another is loaded before this one is used?
[[square [clone *]] [cube [clone square *]]] draft [[sixth [square cube]]] draft [foo math] label draft inscribe [[foo math] [9 sixth] using] confine
[[actual foo] [asked [sized]] [handled yes] [reason "type mismatch"] [type error] [unwound [get [[definition]] unwrap [something?] shield [take swap [get] dip [something?] shield] loop drop [[[math definition] [foo definition [sixth [[definition [square cube]]]]] dictionary_redacted]] unwrap evert first shielddown [] [[definition] unwrap pair [lookup] shield shielddown] step [[9 sixth]] unwrap confine]]] math [foo definition [sixth [[definition [square cube]]]]] dictionary_redacted
So here we have modules that depend on each other, but the actual dependency is not expressed here. We have to manually load math even though we are only calling foo.
At what point do we load the dep and how? Inside the module code we do have access to the dictionary as input.
Let's list the steps:
- retrieve dictionary
- "Main" - load prelude
- prelude says hashA
- decache, read hashA -> moduleA
- execute moduleA
- moduleA prelude empty (no deps)
- execute rest of moduleA to modify dict
Re below: does the math module need to be available as a module, or just merged with the dictionary, or both? If we just make it a module, we need to put using in the impl of sixth, which would be very slow if called repeatedly. Some kind of caching might be needed.
I think perhaps the original model of a single namespaced dictionary might be best, such that we don't have to repeatedly apply changes - we just do it at library loading time, and then the entire execution uses the same dictionary. During execution we can change how words are resolved via using. But the difference between this and the previous implementation of namespaces is that resolution is altered for the entire execution of a program, and not just the words within that program. That should fix the sandboxing security hole that existed in that implementation. So we can have a word depend that loads libraries into their own namespaces (including deletions) and assigns them words, and then using (or use) that alters name resolution for the duration of a program.
;; then the outer mod [[[math #b64 "ULY02dWGqy3G7x9Hd7RgBG2q+Dw9RE8hC4dUjzyCqRk="]] ;; load each dep module [unwrap module [wrap] dip entry assign] step ;; this makes the modules available as a package, ;; also need to integrate them into this dictionary [[sixth [square cube]]] draft execute] encode [] cache
#b64 "QUsvdqQ6E46RsswGPAAlbz7tTWpdefkXPA5NByxGr4c="
;; first cache the inner mod [[[square [clone *]] [cube [clone square *]]] draft] encode [] cache ;; then the outer mod [[[[math #b64 "ULY02dWGqy3G7x9Hd7RgBG2q+Dw9RE8hC4dUjzyCqRk="]] ;; load each dep module [unwrap module [wrap] dip assign] step] [[sixth [square cube]]] draft join] encode [] cache dictionary ;; outer mod ;; first the prelude [[foo #b64 "QUsvdqQ6E46RsswGPAAlbz7tTWpdefkXPA5NByxGr4c="]] ;; load each dep module [unwrap module [wrap] dip entry assign] step ;[[foo] lookup] shield dump drop [[foo] [9 sixth] [dictionary swap [wrap shielddown] step] dip confine] confine ;; outer confine creates a dictionary with all the module words
[[foo] [9 sixth] [dictionary swap [wrap shielddown] step] dip confine] dictionary_redacted #b64 "rJDrBYlO0RF3MHLh9tisE6kvRpGdVqtcKaWm0VyG6CQ=" #b64 "i4VyOtSDZ8aKk1YkF4scKaml+ULiClqGdeG3D8NST30="
2 3 [3 2] shielddown
2 2
2.24.3.4. TODO Revert back to namespaces
[[square [clone *]]] draft [emit] map ;encode ;[hashbytes] shield swap string read ;[dictionary clone] dip shielddown
["[[square [[definition [clone *]]]]]" "join"]
Let's figure out how to implement inscribe
without the caller having
to provide a hash (which is a security flaw anyway). Do we require
going through the cache all the time or do we round trip serialize so
we can support inline modules? I lean toward the latter, let's try that first.
[[plus2 [2 +]]] draft dictionary swap [emit encode hashbytes] shield [shield] dip sink dictmerge ;; [first [plus2] unwrap =] filter ;; to check if the word is there [#b64 "g5nJOWpyglIeN2EgJOdFRVQ0ix76q42bTs2uG5w5J/s="] [5 plus2] [program resolver dictionary] label environment evaluate
[[dictionary dictionary_redacted] [program []] [stack [7]]]
Ok, now let's refactor this into the form we want for let.
[[plus2 [2 +]] [stuff [plus2 3 *]]] [5 stuff] [draft dictionary swap [emit encode hashbytes] shield [shield] dip sink [dictmerge] shielddeep] dip ;; under the let program ;; prog dict hash [wrap] dipdown ;; wrap the hash to make a list of 1 namespace [program dictionary resolver] label environment evaluate [stack] lookup restore
21
Ok, it seems like we have a reasonable definition of let, so what would a nested call look like? Would we even do this, or just use actual modules? Let's say some local function has its own local function.
[[cube [clone clone * *]]] [2 + cube] let [plus2cube] label [4 plus2cube] let execute
216
Now let's make sure the nesting has proper priority if there are conflicts.
[[foo ["outer"]]] ["inner"] let [foo] label [foo] let execute
"inner"
2.24.3.5. TODO Fix partition module logic
The problem is that let
should modify the given program to wrap in a
'using', so that the module is defined once and then the program can
be run many times even with different dictionaries in different
environments - the 'using' will just add one more module to the resolver.
2.24.4. INPROGRESS inscribe currently re-defines words repeatedly at runtime
- State "INPROGRESS" from "TODO"
2.24.4.1. INPROGRESS Current design
- State "INPROGRESS" from "TODO"
Have two separate words: one, loadlib, for loading new modules, and another, using, for activating them in a given program.
Using hashes to refer to modules or namespaces is secure, but hard to read. We can use aliases, but we need to be careful about which aliases we use - should we trust an alias created by a module we loaded? Can we overwrite aliases?
Attack scenario: we load a module foo, and it creates an alias bar and then we later assume bar refers to something else, and call its 'quux' word.
There are several different loading mechanisms which is what makes this functionality difficult:
- defining a module inline (we provide the bytes and perhaps also an alias)
- loading a stdlib module into the default namespace
- loading a stdlib module into its own namespace
- loading an externally-downloaded module and giving it an alias
I think for the last case, if you load a module and give it an alias simultaneously, I don't see how an attacker can get you, as long as attempting to overwrite an alias is an error. If the alias already existed, you get an error, and if not, you're guaranteed the content can't change after that point. It's only dangerous when you try to use an alias you never attempted to create yourself.
This may at least mitigate RCE attacks, but it does still leave the problem of aliases potentially colliding and those collisions being hard to predict. I don't know that this is a problem unique to this language though.
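The no-overwrite rule above can be sketched as follows (Python, not kcats; bind_alias and AliasExists are hypothetical names). Re-binding the same alias to the same hash is a harmless verification; re-binding to different content is an error.

```python
class AliasExists(Exception):
    """Attempt to redirect an existing alias to different content."""

def bind_alias(aliases, name, content_hash):
    # Loading and aliasing happen together, and overwriting is an
    # error, so an alias you created yourself can never be silently
    # redirected by a module loaded later.
    if name in aliases:
        if aliases[name] != content_hash:
            raise AliasExists(f"alias {name!r} already bound")
        return aliases            # same binding: verifies the match
    return {**aliases, name: content_hash}

aliases = bind_alias({}, "foo", "hash-of-foo")
aliases = bind_alias(aliases, "foo", "hash-of-foo")   # ok: idempotent
try:
    bind_alias(aliases, "foo", "hash-of-evil")        # attack attempt
    raised = False
except AliasExists:
    raised = True
assert raised
```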
In terms of implementation details, we could leave it mostly as-is except we need the ability to
- load a library using a predefined alias (for stdlibs that aren't loaded by default)
- load libraries into the default (no) namespace
So how do we implement that? inscribe doesn't know or care about where the content comes from, so we need a way of fetching from the cache. We can check the mapping of alias->hash: if it exists, verify a match, and if it doesn't exist, create the mapping. Since the mapping will exist before calling inscribe, inscribe needs a way of not creating the alias - we could do that by allowing [] as an alias.
2.24.4.2. INPROGRESS Library loading
- State "INPROGRESS" from "TODO"
- TODO Make library loading primitives
- DONE Move hash to earlier loading module
- State "DONE" from "INPROGRESS"
- State "INPROGRESS" from "TODO"
We need it for library loading; it can't stay as part of the crypto lib.
- DONE Primitive for namespacing a word
- State "DONE" from "INPROGRESS"
- State "INPROGRESS" from "TODO"
- DONE Primitive for loading a blob from cache via hash or alias
- State "DONE" from "INPROGRESS"
- State "INPROGRESS" from "TODO"
This verifies the module being loaded from cache matches the builtin hash.
[crypto] clone unwrap decache swap "modules" encodestring hashbytes swap unwrap namespaced wrap [definition] join dictionary swap lookup first swap hashbytes =
yes
this just inscribes the module
[pipes] clone unwrap decache string read ;;inscribe ;;[crypto] ["foo" hash] using
[[pipe-in [[spec [[item] [pipe]]] [definition [association [[[type [file] unwrap =] [value file-in]] [[type [stdout] unwrap =] [stdout]]] decide]]]] [tunnel [[spec [[item] [pipe]]] [definition [association [[[type [ip-host] unwrap =] [clone [port] lookup [[address] lookup] dip serversocket]] [[type [ip-client] unwrap =] [clone [port] lookup [[address] lookup] dip socket]]] decide]]]] [pipe-out [[spec [[item] [pipe]]] [definition [association [[[type [file] unwrap =] [value file-out]] [[type [ip-host] unwrap =] [clone [port] lookup [[address] lookup] dip serversocket]]] decide]]]] [spit [[spec [[item [item target]] []]] [definition [[pipe-in] dip encode put drop]]]] [slurp [[spec [[pipe] [item]]] [definition [[take] [join] fold string [drop drop] dip]]]] [print [[spec [[string] []]] [definition [[standard] dip "\n" join encode put drop]]]] [sleep [[spec [[integer] []]] [definition [timer take drop drop]]]] [future [[spec [[program] [pipe]]] [definition [handoff swap [snapshot] join wrap [dive put drop] join spawn animate]] [examples [[[1 [2 +] future take dropdown] [1 [3]]]]]]] [generator [[spec [[[program generator-maker]] [[program wrapped-generator]]]] [definition [[] swap inject [[generate] inject take]]]]] [siphon [[spec [[[receptacle output] [program generator]] [[receptacle output]]]] [description "Generates values from a wrapped generator (stacked generator inside a list), until exhausted, puts all items into the output receptacle"] [definition [[] [empty?] [drop [generate clone] dip sink [[put] bail] dip] until drop drop sink drop drop]] [examples [[[[[integers 5 taker] generator [] siphon] shield] [[0 1 2 3 4]]]]]]] [close [[spec [[pipe] []]] [definition [drop]]]]] [pipes] [[actual close] [asked [sized]] [handled yes] [reason "type mismatch"] [type error] [unwound [dictmerge [generators] [decache [] swap inscribe] step]]] []
- CANCELED Primitive for aliasing a module
- State "CANCELED" from "TODO"
Not going to use aliases.
Needs to respect the "no overwrite" rule.
- TODO Test out saving modules as dict entries
We can think of a module as a dictionary-modification program. Can we store those programs in the dictionary as definitions and use them later? No, not really, because we can't modify the current dictionary. So there's no way to just install an alias to a module.
"123" encode [crypto] stdmod [hash] boomerang
#b64 "pmWkWSBCL51Bfkhn79xPuKBKHz//H6B+mY6G9/eieuM="
Ok, so this is straightforward enough, but what if we want to use a module later? Do we need to reload it? We could save dictionaries, but that's not the same as saving the modules that comprise them so that they can be combined in different ways.
In most languages, a module continues to be addressable by an alias after you've loaded it (and doesn't result in loading it again).
A couple ways to deal with that - can store a diff instead of the module program and just apply it. Can just reload the module every time. Can keep the resulting dictionary somewhere (on the stack?).
Kcats is just function composition, I don't think there's any case where more than one function needs to be available at a time. Dividing the dictionary seems to be more of a security feature - you're saying "this program should only need these functions but I don't know what the lower levels try to do". In theory you could give different dictionaries to every word of the program that consisted of all the words that word calls, etc. So this whole system is more of a "matching my expectations to the actual behavior" type of feature.
- TODO Test out nested envs as library mechanism
;; load a module [crypto] unwrap decache string read [["foo" "bar" "baz"] [hash] map] spawn clone evaluate ;;boomerang
[[program []] [stack [[#b64 "LCa0a2j/xo/5m0U8HTBBNBNCLXBkg7+g+YpeiGJm564=" #b64 "/N4rLtula/QIYB+3If6bXDONEO5CnqBPrlURto+/j7k=" #b64 "uqWglk0zIPvAxqkiFARTyFE+okq4/QV3A0gEqWckgJY="]]]] [[program [["foo" "bar" "baz"] [hash] map]] [stack []]]
So what is let in this model?
[[plus2 [2 +]]] [5 plus2] [draft] dip boomerang
7
2.24.4.3. INPROGRESS Nesting scopes
- State "INPROGRESS" from "TODO"
We should be able to chain calls to using
without repeating any
expensive calls.
That means we need a word that only modifies a program by resolving the words in it (we could call it resolve?).
test overriding behavior
[[+ ["oo"]]] ["a" "b" +] let
Resolved: Word { data: 0x55b105c8ce50 : "+", namespace: Some([95, 8, 227, 46, 226, 44, 177, 198, 142, 175, 56, 167, 55, 63, 227, 254, 126, 182, 160, 134, 28, 68, 164, 57, 208, 15, 103, 59, 84, 173, 128, 139]) } ["oo"] "b" "a"
The issue is we're updating the definition but we end up keeping the old spec and examples. We need to clear the whole entry.
[foo] [[swap [[6]]]] [[0] [wrap] update ;; wrap the word name to get a path to update [1] [[definition] label] update ] map [[update] join] map [joiner] assemble unwrap inscribe wrap [swap] using
[[actual [[definition [[6]]]]] [asked [program]] [handled yes] [reason "type mismatch"] [type error] [unwound [update dictmerge wrap [swap] using]]] [[definition [[6]]]] [swap] dictionary_redacted #b64 "DFyCLJSxw6T5kHmcZ0+a4jZT0gNq29vn/BE9oKcRSTU=" foo
Ok, this is too rigid; let's get rid of revise/draft in their current form and make words that build what inscribe needs.
After some internal debate, I think it's best to have inscribe take a single update program - the rationale is that there's no need at that point to treat updates separately; that can be done beforehand.
So inscribe is currently correct; we need to … revise revise. The first thing is a function that takes a list of word updates and translates it to a single dictionary update.
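That translation step might look like this (a Python sketch, not kcats; updates is a hypothetical stand-in for the word being described). Each word update replaces the whole entry, which also clears any stale spec or examples, matching the fix noted above.

```python
def updates(word_updates):
    # Translate a list of (word, entry) updates into a single
    # dictionary-update function, so inscribe only ever sees one
    # update program.
    def apply(dictionary):
        new = dict(dictionary)
        for word, entry in word_updates:
            new[word] = entry            # replace the whole entry,
        return new                       # not just the definition
    return apply

update = updates([("swap", {"definition": ["6"]})])
d = update({"swap": {"definition": ["old"], "spec": "stale"}})
assert d["swap"] == {"definition": ["6"]}   # old spec cleared too
```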
[foo] [[swap [[[definition [6]] [spec [[] [number]]]]]]] [[take] [[0] [wrap] update ;; wrap the word name to get a path to update [update] join] each joiner generate] shielddown inscribe wrap [swap] using
Resolved: Word { data: 0x56294637f420 : "swap", namespace: Some([59, 133, 203, 149, 9, 0, 17, 144, 86, 83, 103, 44, 36, 226, 184, 25, 62, 38, 28, 127, 173, 154, 55, 144, 71, 243, 173, 235, 59, 37, 10, 18]) } 6
Now we need a shortcut for when we don't want to specify the whole entry, just the definition
[foo] [[swap [6]]] [entry] map updates inscribe wrap [swap] using
6
Test draft
[[swap [6]]] draft [swap] let
6
[foo] [[bar ["hi"]]] draft inscribe
foo
Ok, this all seems to be in order. There's now a problem in the stdlib where a word calls let and defines some local functions, eg partition. What we actually need is to do resolve as the stdlib is being built, perhaps defining partition in its own little module, and then resolving the module once. Something like this:
;; functions partition uses [partition] [[take [[taker collect dropdown dropdown] ; drop the used-up taker generator join divedeep]] [shift [[[count <=] [swap 0 slice] [[]] if] shield swap]] ] [entry] map [partition] [[spec [[] [program]]] [definition [[] ;; state ;; the generator [[[] [over wrap take [join shift] bail] [[over] dive wrap take swap drop shift] if]]]]] assign updates inscribe
2.24.4.4. TODO Stack escape protection
If a program refers to a word, and at the time that program is put on the stack that word means something, it should still carry the same meaning if that program is later run with execute. That means that module changes must be permanent.
2.24.4.5. INPROGRESS Sandboxing support
- State "INPROGRESS" from "TODO"
It must be possible for a module to deny access to a given word. Given that dictionary changes are permanent, we can't just delete words from the dictionary (once the given program is done, that word needs to be available again somehow).
We can implement this with some sort of shadowing mechanism during resolving: If we "delete" a word, we could actually define the word in the module's namespace, such that all it does is throw a 'no such word' error. That's one somewhat hacky way to implement it, but there may be others.
- There should be no way for code using the module to access the "deleted" word. (Check for escape hatches via arbitrary dictionary modification)
- The word should be accessible again after the program using the module has completed.
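The shadowing mechanism above can be sketched in miniature. This is a hypothetical Python model (plain dict for the dictionary, a Python exception standing in for kcats' 'no such word' error), showing that the base dictionary survives the sandbox untouched:

```python
class NoSuchWord(Exception):
    pass

def sandboxed(dictionary, denied):
    """Copy the dictionary, shadowing denied words with an error-thrower."""
    def thrower(name):
        def raise_error(*args):
            raise NoSuchWord(name)
        return raise_error
    shadowed = dict(dictionary)  # the original dictionary stays intact
    for name in denied:
        shadowed[name] = thrower(name)  # shadow rather than delete
    return shadowed

base = {"add": lambda a, b: a + b, "print": print}
inner = sandboxed(base, ["print"])
# inner["add"] still works; inner["print"] raises; base is unchanged
```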
- TODO experiment with nested envs as module loading mechanism
First let's see if a nested env allows sandboxing
;; make an env with no access to io ;; load the functional module code [functional] unwrap decache string read ;; get the current dictionary and modify it [dictionary] dip execute ;; Now create a new env with this dictionary and execute it ["hi" print] [program dictionary] label environment evaluate
[[program []] [stack [[[asked [standard]] [handled yes] [reason "word is not defined"] [type error] [unwound [standard ["hi"] unwrap "\n" join encode put drop]]]]]]
Yes!
OK, now let's make a word that takes a module and a program and builds a new env.
"foo" [bar] [[plus2 [2 +]]] [entry] map wrap [join] join [5 plus2] spawn ;dictionary ;float shielddown [dictionary program] label environment evaluate [stack] lookup restore
7 [bar] "foo"
Also, for debugging purposes it would seem we need a way of running eval-step (outer) in a loop - perhaps an axiom that checks whether the program is empty and, if not, eval-steps it and places eval-step back in the program? Slightly better than a loop impl. Maybe it's ok for now.
[[program [1 2 3 +]]] environment eval-step
[[program [2 3 +]] [stack [1]]]
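The outer stepping loop described above can be modeled with a toy evaluator. A hypothetical Python sketch, mirroring only the shape of the env (a map of program and stack) with "+" as the one implemented word:

```python
def eval_step(env):
    """Take one instruction off the program; literals go to the stack."""
    program, stack = env["program"], list(env["stack"])
    if not program:
        return env
    instr, rest = program[0], program[1:]
    if instr == "+":
        b, a = stack.pop(), stack.pop()
        stack.append(a + b)
    else:
        stack.append(instr)
    return {"program": rest, "stack": stack}

env = {"program": [1, 2, 3, "+", "+"], "stack": []}
while env["program"]:   # the outer loop: keep stepping until the program is empty
    env = eval_step(env)
# env == {"program": [], "stack": [6]}
```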
Let's see if it's possible to use an inner-env module and still debug it.
;; the program we want to debug in the top level [10 11 [[plus2 [2 +]]] draft inscribe [plus2] confine] ;; TODO: the issue here is that we can't just eval-step once, We have ;; to make the next instruction also eval-step (unless the program is ;; already empty) [[evaluate definition] ;; The inner env debugger - when we call `confine`, it's going to ;; call `evaluate`, which will not allow us to step through the inner ;; env's execution. In order to do that, we have to just eval-step ;; it. Instead of running evaluate as a single atomic word we just ;; make a kcats version of it. [[[program] lookup] [eval-step] while] assign] inscribe swap spawn
[[asked [consume]] [handled yes] [reason "not enough items on stack"] [type error] [unwound [confine]]] [[dictionary dictionary_redacted] [program [10 11 [[plus2 [2 +]]] draft inscribe [plus2] confine]] [stack []]]
;; the program we want to debug in the top level 10 11 [[plus2 [2 +]]] draft inscribe [plus2] confine
13 10
2.24.4.6. INPROGRESS Access control
- State "INPROGRESS" from "TODO"
Take the word define, which allows the caller to make arbitrary and permanent changes to the dictionary. What if we wanted to restrict access to that word such that authorized programs can call it but others can't?
2.24.4.7. INPROGRESS Words can refer to other words in the same library
- State "INPROGRESS" from "TODO"
2.24.4.8. TODO Convenient module definition
We need a word that takes care of the common case: we want to define a set of vocabulary, it's all additive, and some of the words refer to each other. Previously we called that draft. Here's what the new draft could look like; it breaks down into generating dictionary updates:
[[square] [[clone *] [[number] [number]] [spec definition] label] update [cube] [[clone square *] [[number] [number]] [spec definition] label] update] encode [math] unwrap inscribe
So we want the user to be able to write this:
[[square [clone *]] [cube [clone square *]]] [math] [[[0] [wrap] update [1] [[definition] label wrap] update] map [[update] join] map [joiner] assemble unwrap encode] dip ;; under the module name unwrap inscribe [math] [6 cube] using
216
Can we recurse?
[[square [clone *]] [cube [clone square *]] [factorial [[swap positive?] [[*] shielddown [dec] dip factorial] when]]] [math] [[[0] [wrap] update [1] [[definition] label wrap] update] map [[update] join] map [joiner] assemble unwrap encode] dip ;; under the module name unwrap ; inscribe ;[math] [6 1 factorial dropdown] resolve [2] lookup inspect
720
Let's see if we can make revise first.
[f] [[fff ["foo"]]] draft [hh] [[hash [[type [foo] unwrap =] [drop fff hash] addmethod [f] swap resolve]]] revise [hh] [[[foo myfoo]] association hash] using "foo" hash =
yes
[ff] [[f ["foo"]]] draft [gg] [[g [[ff] [f] resolve]]] revise [gg] [g] using
"foo"
Let's see if the draft in terms of revise works.
[math] [[square [clone *]] [cube [clone square *]] [factorial [[swap positive?] [[*] shielddown [dec] dip factorial] when]]] draft [math] [6 1 factorial dropdown] using
720
Test if we can use resolve before a draft, to refer to existing modules within a module. This works because inscribe can take either serialized bytes or a parsed data structure; if we pass the latter, it can have words already resolved.
[innermodule] [[innerfn ["foo"]]] draft [outermodule] [innermodule] [[outerfn [innerfn]]] resolve draft [outermodule] [outerfn] using
"foo"
[[program [[innermodule] [[innerfn ["foo"]]] draft [outermodule] [[outerfn [[innermodule] [innerfn] resolve]]] revise [outermodule] [outerfn] using ]]] environment [advance] 10 times
[[stack ["foo"]] [program []]]
[[fiver [5 +]]] [12 fiver] [[0] swap ;; add an alias (0 means don't bother with creating the alias) [entry] map ;; create full entries for each definition wrap [join] join ;; add 'join' to join the entries with the existing dictionary inscribe wrap] dip ;; update dict and then wrap the hash as the module to be used using ;; execute the program
17
2.24.4.9. TODO convenient 'let'
We want to define a module and use it inline without having to worry about its alias.
Let's see if we can implement partition this way.
;; the module [[square [clone *]] [cube [clone square *]]] [9 cube] [] sink [draft wrap] dip using ; [ ; [[] ; [over wrap take [join shift] bail] ; [[over] dive wrap take swap drop shift] ; if] ; draft]
729
[[] [[foo [1]]] [[1] [wrap] update] map revise] execute ;tracer 100 taker collect
[] [[foo [1]]] [[1] [wrap] update] map
[[foo [[1]]]] []
2.24.4.10. INPROGRESS Break up the standard library
- State "INPROGRESS" from "TODO"
Question: how do we break it up? For example, does io refer to all I/O operations or just the base stuff that they all depend on?
Some areas of functionality that already exist:
- Debugger
- Generators
- Pipes
- File
- Network
- Channel
- Encoding
- Associations
- Collections
There seem to be 3 types of modules:
- Stuff that's part of the core functionality (can't do basic language stuff without it, should be part of the binary). Could be up to and including what's needed to call inscribe/using, which allows callers to load their own modules.
- Not core but can still be in the default namespace because it likely doesn't collide with other stuff (io, nested env etc)
- Stuff that's not used often enough to be in default namespace, or is easy enough to refer in when needed (crypto, debugger)
For the second type, we can now use inscribe and make them normal modules, but then they won't be in the default namespace. We can at least leave them out of the binary.
- INPROGRESS Figure out where to put modules on disk
- State "INPROGRESS" from "TODO"
This will make the build and packaging more complex as it will have to include other files besides the binary and deal with platform-specific issues. The standard env could, however, still load a bunch of libs, but maybe this could be overridden with cmdline args.
[[program [[foo] clone unwrap decache inscribe]]] environment [eval-step clone] collect
Warning, failed to insert into dictionary: Dispenser(Sized(List([Int(5), Word(Word { data: 0x55782137afc0 : "+", namespace: None })]))) Warning: empty local module [[[program [clone unwrap decache inscribe]] [stack [[foo]]]] [[program [unwrap decache inscribe]] [stack [[foo] [foo]]]] [[program [decache inscribe]] [stack [foo [foo]]]] [[program [inscribe]] [stack [#b64 "W1tiYXJdIFs1ICtdIGFzc2lnbl0=" [foo]]]] [[program [[[bar] [5 +] assign] shielddown dictmerge]] [stack [dictionary_redacted #b64 "mz5w5sBIFt1413HhyrFOWKaa8MhJFlgZE/PeBAiaz40=" foo]]] [[program [shielddown dictmerge]] [stack [[[bar] [5 +] assign] dictionary_redacted #b64 "mz5w5sBIFt1413HhyrFOWKaa8MhJFlgZE/PeBAiaz40=" foo]]] [[program [shield dropdown dictmerge]] [stack [[[bar] [5 +] assign] dictionary_redacted #b64 "mz5w5sBIFt1413HhyrFOWKaa8MhJFlgZE/PeBAiaz40=" foo]]] [[program [[snapshot] dip inject first dropdown dictmerge]] [stack [[[bar] [5 +] assign] dictionary_redacted #b64 "mz5w5sBIFt1413HhyrFOWKaa8MhJFlgZE/PeBAiaz40=" foo]]] [[program [dip inject first dropdown dictmerge]] [stack [[snapshot] [[bar] [5 +] assign] dictionary_redacted #b64 "mz5w5sBIFt1413HhyrFOWKaa8MhJFlgZE/PeBAiaz40=" foo]]] [[program [snapshot [[[bar] [5 +] assign]] unwrap inject first dropdown dictmerge]] [stack [dictionary_redacted #b64 "mz5w5sBIFt1413HhyrFOWKaa8MhJFlgZE/PeBAiaz40=" foo]]] [[program [[] evert clone evert unwrap [[[bar] [5 +] assign]] unwrap inject first dropdown dictmerge]] [stack [dictionary_redacted #b64 "mz5w5sBIFt1413HhyrFOWKaa8MhJFlgZE/PeBAiaz40=" foo]]] [[program [evert clone evert unwrap [[[bar] [5 +] assign]] unwrap inject first dropdown dictmerge]] [stack [[] dictionary_redacted #b64 "mz5w5sBIFt1413HhyrFOWKaa8MhJFlgZE/PeBAiaz40=" foo]]] [[program [clone evert unwrap [[[bar] [5 +] assign]] unwrap inject first dropdown dictmerge]] [stack [[dictionary_redacted #b64 "mz5w5sBIFt1413HhyrFOWKaa8MhJFlgZE/PeBAiaz40=" foo]]]] [[program [evert unwrap [[[bar] [5 +] assign]] unwrap inject first dropdown dictmerge]] 
[stack [[dictionary_redacted #b64 "mz5w5sBIFt1413HhyrFOWKaa8MhJFlgZE/PeBAiaz40=" foo] [dictionary_redacted #b64 "mz5w5sBIFt1413HhyrFOWKaa8MhJFlgZE/PeBAiaz40=" foo]]]] [[program [unwrap [[[bar] [5 +] assign]] unwrap inject first dropdown dictmerge]] [stack [[[dictionary_redacted #b64 "mz5w5sBIFt1413HhyrFOWKaa8MhJFlgZE/PeBAiaz40=" foo]] dictionary_redacted #b64 "mz5w5sBIFt1413HhyrFOWKaa8MhJFlgZE/PeBAiaz40=" foo]]] [[program [[[[bar] [5 +] assign]] unwrap inject first dropdown dictmerge]] [stack [[dictionary_redacted #b64 "mz5w5sBIFt1413HhyrFOWKaa8MhJFlgZE/PeBAiaz40=" foo] dictionary_redacted #b64 "mz5w5sBIFt1413HhyrFOWKaa8MhJFlgZE/PeBAiaz40=" foo]]] [[program [unwrap inject first dropdown dictmerge]] [stack [[[[bar] [5 +] assign]] [dictionary_redacted #b64 "mz5w5sBIFt1413HhyrFOWKaa8MhJFlgZE/PeBAiaz40=" foo] dictionary_redacted #b64 "mz5w5sBIFt1413HhyrFOWKaa8MhJFlgZE/PeBAiaz40=" foo]]] [[program [inject first dropdown dictmerge]] [stack [[[bar] [5 +] assign] [dictionary_redacted #b64 "mz5w5sBIFt1413HhyrFOWKaa8MhJFlgZE/PeBAiaz40=" foo] dictionary_redacted #b64 "mz5w5sBIFt1413HhyrFOWKaa8MhJFlgZE/PeBAiaz40=" foo]]] [[program [swap evert take dip evert first dropdown dictmerge]] [stack [[[bar] [5 +] assign] [dictionary_redacted #b64 "mz5w5sBIFt1413HhyrFOWKaa8MhJFlgZE/PeBAiaz40=" foo] dictionary_redacted #b64 "mz5w5sBIFt1413HhyrFOWKaa8MhJFlgZE/PeBAiaz40=" foo]]] [[program [evert take dip evert first dropdown dictmerge]] [stack [[dictionary_redacted #b64 "mz5w5sBIFt1413HhyrFOWKaa8MhJFlgZE/PeBAiaz40=" foo] [[bar] [5 +] assign] dictionary_redacted #b64 "mz5w5sBIFt1413HhyrFOWKaa8MhJFlgZE/PeBAiaz40=" foo]]] [[program [take dip evert first dropdown dictmerge]] [stack [[[[bar] [5 +] assign] dictionary_redacted #b64 "mz5w5sBIFt1413HhyrFOWKaa8MhJFlgZE/PeBAiaz40=" foo] dictionary_redacted #b64 "mz5w5sBIFt1413HhyrFOWKaa8MhJFlgZE/PeBAiaz40=" foo]]] [[program [dip evert first dropdown dictmerge]] [stack [[[bar] [5 +] assign] [dictionary_redacted #b64 
"mz5w5sBIFt1413HhyrFOWKaa8MhJFlgZE/PeBAiaz40=" foo] dictionary_redacted #b64 "mz5w5sBIFt1413HhyrFOWKaa8MhJFlgZE/PeBAiaz40=" foo]]] [[program [[bar] [5 +] assign [[dictionary_redacted #b64 "mz5w5sBIFt1413HhyrFOWKaa8MhJFlgZE/PeBAiaz40=" foo]] unwrap evert first dropdown dictmerge]] [stack [dictionary_redacted #b64 "mz5w5sBIFt1413HhyrFOWKaa8MhJFlgZE/PeBAiaz40=" foo]]] [[program [[5 +] assign [[dictionary_redacted #b64 "mz5w5sBIFt1413HhyrFOWKaa8MhJFlgZE/PeBAiaz40=" foo]] unwrap evert first dropdown dictmerge]] [stack [[bar] dictionary_redacted #b64 "mz5w5sBIFt1413HhyrFOWKaa8MhJFlgZE/PeBAiaz40=" foo]]] [[program [assign [[dictionary_redacted #b64 "mz5w5sBIFt1413HhyrFOWKaa8MhJFlgZE/PeBAiaz40=" foo]] unwrap evert first dropdown dictmerge]] [stack [[5 +] [bar] dictionary_redacted #b64 "mz5w5sBIFt1413HhyrFOWKaa8MhJFlgZE/PeBAiaz40=" foo]]] [[program [[[dictionary_redacted #b64 "mz5w5sBIFt1413HhyrFOWKaa8MhJFlgZE/PeBAiaz40=" foo]] unwrap evert first dropdown dictmerge]] [stack [dictionary_redacted #b64 "mz5w5sBIFt1413HhyrFOWKaa8MhJFlgZE/PeBAiaz40=" foo]]] [[program [unwrap evert first dropdown dictmerge]] [stack [[[dictionary_redacted #b64 "mz5w5sBIFt1413HhyrFOWKaa8MhJFlgZE/PeBAiaz40=" foo]] dictionary_redacted #b64 "mz5w5sBIFt1413HhyrFOWKaa8MhJFlgZE/PeBAiaz40=" foo]]] [[program [evert first dropdown dictmerge]] [stack [[dictionary_redacted #b64 "mz5w5sBIFt1413HhyrFOWKaa8MhJFlgZE/PeBAiaz40=" foo] dictionary_redacted #b64 "mz5w5sBIFt1413HhyrFOWKaa8MhJFlgZE/PeBAiaz40=" foo]]] [[program [first dropdown dictmerge]] [stack [[dictionary_redacted #b64 "mz5w5sBIFt1413HhyrFOWKaa8MhJFlgZE/PeBAiaz40=" foo] dictionary_redacted #b64 "mz5w5sBIFt1413HhyrFOWKaa8MhJFlgZE/PeBAiaz40=" foo]]] [[program [dropdown dictmerge]] [stack [dictionary_redacted dictionary_redacted #b64 "mz5w5sBIFt1413HhyrFOWKaa8MhJFlgZE/PeBAiaz40=" foo]]] [[program [swap drop dictmerge]] [stack [dictionary_redacted dictionary_redacted #b64 "mz5w5sBIFt1413HhyrFOWKaa8MhJFlgZE/PeBAiaz40=" foo]]] [[program [drop dictmerge]] 
[stack [dictionary_redacted dictionary_redacted #b64 "mz5w5sBIFt1413HhyrFOWKaa8MhJFlgZE/PeBAiaz40=" foo]]] [[program [dictmerge]] [stack [dictionary_redacted #b64 "mz5w5sBIFt1413HhyrFOWKaa8MhJFlgZE/PeBAiaz40=" foo]]] [[program []] [stack [foo]]]] [eval-step clone] []
[crypto-builtins] clone unwrap decache string read [[1] [wrap] update] map ;; wrap the definition so update leaves the literal value updates inscribe wrap ;; the module alias ["foobar" encode hashbytes] using
#b64 "Zm9vYmFy"
[foo] unwrap [[baryyyy] [5 +] assign] inscribe wrap [6 baryyyy] using
Warning, failed to insert into dictionary: Dispenser(Sized(List([Int(5), Word(Word { data: 0x55a8be888030 : "+", namespace: None })]))) Warning: empty local module [[asked [baryyyy]] [handled yes] [reason "word is not defined"] [type error] [unwound [baryyyy]]] 6
[foo] [bar] resolve
[[actual foo] [asked [resolve]] [handled yes] [reason "module not found"] [type error] [unwound [resolve]]]
- INPROGRESS Move core libs back to using includebytes
- State "INPROGRESS" from "TODO"
- INPROGRESS Figure out what to do about builtins
- State "INPROGRESS" from "TODO"
- INPROGRESS Group functions
- State "INPROGRESS" from "TODO"
- INPROGRESS Determine module loading order
- State "INPROGRESS" from "TODO"
- stack ops builtins [drop clone evert]
- stack motion builtins [swap swapdown float sink]
- stack motion [flip] (only depends on own builtins)
- collection builtins [join put count first step take wrap unwrap …]
- execution builtins [dip execute branch recur loop decide]
2.24.4.11. CANCELED Disallow module alias overwriting
- State "CANCELED" from "TODO"
We're not going to be using aliases in this branch
It is a security problem to allow code to overwrite some other code's alias for a module; throw an error in this case. We should call insert on the aliases, and if it returns something, it had better be equal to what we inserted.
2.24.4.12. TODO Store data sources
When we load modules we will eventually be getting them from the network, but for now the stdlib needs to be loaded from disk. So we could just always look there by default, I guess. But if we need to allow custom sources, we could add that alongside the aliases.
2.24.4.14. TODO Library loading should be in order of decreasing trust
We shouldn't start with a standard env with access to the filesystem. Instead the standard env should lack IO capability, and force that to be loaded in an inner env.
So instead of dropping back to a functional env, we start with one and load io capability when needed.
Questions:
- How do we decide what "less trusted" means?
- Does decreasing trust make sense?
- Does that mean we should never need to delete words? Seems like sometimes we might.
Decreasing trust seems to make sense, since you can't make a trustworthy construct out of an untrusted one, only vice versa.
What makes an env less trusted than its parent? Is it the additional words? Is it just that any additional code requires more trust, since it needs to be vetted?
2.24.5. TODO Debugger needs special handling to work with nested environments
For example, to collect a histogram of how many times each word was executed, we can pass each word up the env chain, but we need some extra words to deal with this.
; inner env [1 2 3 + +] stage wrap [evaluate [stack] lookup restore] join stage wrap [evaluate [stack] lookup restore] join execute
6
Let's compare evaluate with the step-wise version.
[0 1 100000 1 range [+] step] stage wrap [evaluate] join timestamps take float dipdown [take] dip -
66 [[from systemtime] [type out] [values [[type integer] [units milliseconds]]]] [[dictionary dictionary_redacted] [program []] [stack [4999950000]]]
Hm, even with the Rust impl of finished? it's 1500ms vs 66ms, about 23x slower.
2.25. INPROGRESS Database
- State "INPROGRESS" from "TODO"
2.25.1. Books db
[[author-first "George"] [author-last "Orwell"] [title "1984"] [year 1949] [subjects [government dystopia surveillance totalitarianism freedom]]] [[author-first "Aldous"] [author-last "Huxley"] [title "Brave New World"] [year 1932] [subjects [society technology dystopia happiness drugs]]] [[author-first "F. Scott"] [author-last "Fitzgerald"] [title "The Great Gatsby"] [year 1925] [subjects [wealth love obsession american-dream tragedy]]] [[author-first "J.D."] [author-last "Salinger"] [title "The Catcher in the Rye"] [year 1951] [subjects [adolescence alienation innocence society adulthood]]] [[author-first "Jane"] [author-last "Austen"] [title "Pride and Prejudice"] [year 1813] [subjects [love marriage society class reputation]]] [[author-first "Mary"] [author-last "Shelley"] [title "Frankenstein"] [year 1818] [subjects [creation science responsibility monster humanity]]] [[author-first "John"] [author-last "Steinbeck"] [title "Of Mice and Men"] [year 1937] [subjects [friendship dream loneliness society tragedy]]] [[author-first "Ernest"] [author-last "Hemingway"] [title "The Old Man and the Sea"] [year 1952] [subjects [endurance nature old-age fisherman sea]]] [[author-first "Harper"] [author-last "Lee"] [title "To Kill a Mockingbird"] [year 1960] [subjects [racism innocence morality law childhood]]] [[author-first "J.R.R."] [author-last "Tolkien"] [title "The Lord of the Rings"] [year 1954] [subjects [adventure elf dwarf hobbit ring journey magic evil]]] [[author-first "Joseph"] [author-last "Conrad"] [title "Heart of Darkness"] [year 1899] [subjects [colonization africa journey morality darkness europeans]]] [[author-first "Leo"] [author-last "Tolstoy"] [title "War and Peace"] [year 1869] [subjects [war peace society history love aristocracy]]] [[author-first "Homer"] [title "The Odyssey"] [year -800] [subjects [journey odyssey homecoming gods heroism adventure]]] [[author-first "Charlotte"] [author-last "Bronte"] [title "Jane Eyre"] [year 1847] [subjects [love 
morality society class womanhood independence]]] [[author-first "Mark"] [author-last "Twain"] [title "Adventures of Huckleberry Finn"] [year 1884] [subjects [adventure racism slavery morality friendship river]]] [[author-first "Ray"] [author-last "Bradbury"] [title "Fahrenheit 451"] [year 1953] [subjects [censorship knowledge books society dystopia future]]] [[author-first "Charles"] [author-last "Dickens"] [title "A Tale of Two Cities"] [year 1859] [subjects [revolution love sacrifice resurrection society history]]] [[author-first "William"] [author-last "Golding"] [title "Lord of the Flies"] [year 1954] [subjects [society civilization savagery childhood morality island]]] [[author-first "Miguel de"] [author-last "Cervantes"] [title "Don Quixote"] [year 1605] [subjects [adventure idealism reality knight insanity literature]]] [[author-first "H.G."] [author-last "Wells"] [title "The War of the Worlds"] [year 1898] [subjects [invasion aliens society technology war humanity]]]
db [take] [[[type [book] unwrap =] [[publishYear] lookup 1940 >=]] [execute] every?] keep
Insert some data
[[[author-first "George"] [author-last "Orwell"] [title "1984"] [year 1949] [subjects [government dystopia surveillance totalitarianism freedom]]] [[author-first "Aldous"] [author-last "Huxley"] [title "Brave New World"] [year 1932] [subjects [society technology dystopia happiness drugs]]] [[author-first "F. Scott"] [author-last "Fitzgerald"] [title "The Great Gatsby"] [year 1925] [subjects [wealth love obsession american-dream tragedy]]] [[author-first "J.D."] [author-last "Salinger"] [title "The Catcher in the Rye"] [year 1951] [subjects [adolescence alienation innocence society adulthood]]] [[author-first "Jane"] [author-last "Austen"] [title "Pride and Prejudice"] [year 1813] [subjects [love marriage society class reputation]]] [[author-first "Mary"] [author-last "Shelley"] [title "Frankenstein"] [year 1818] [subjects [creation science responsibility monster humanity]]] [[author-first "John"] [author-last "Steinbeck"] [title "Of Mice and Men"] [year 1937] [subjects [friendship dream loneliness society tragedy]]] [[author-first "Ernest"] [author-last "Hemingway"] [title "The Old Man and the Sea"] [year 1952] [subjects [endurance nature old-age fisherman sea]]] [[author-first "Harper"] [author-last "Lee"] [title "To Kill a Mockingbird"] [year 1960] [subjects [racism innocence morality law childhood]]] [[author-first "J.R.R."] [author-last "Tolkien"] [title "The Lord of the Rings"] [year 1954] [subjects [adventure elf dwarf hobbit ring journey magic evil]]] [[author-first "Joseph"] [author-last "Conrad"] [title "Heart of Darkness"] [year 1899] [subjects [colonization africa journey morality darkness europeans]]] [[author-first "Leo"] [author-last "Tolstoy"] [title "War and Peace"] [year 1869] [subjects [war peace society history love aristocracy]]] [[author-first "Homer"] [title "The Odyssey"] [year -800] [subjects [journey odyssey homecoming gods heroism adventure]]] [[author-first "Charlotte"] [author-last "Bronte"] [title "Jane Eyre"] [year 1847] [subjects 
[love morality society class womanhood independence]]] [[author-first "Mark"] [author-last "Twain"] [title "Adventures of Huckleberry Finn"] [year 1884] [subjects [adventure racism slavery morality friendship river]]] [[author-first "Ray"] [author-last "Bradbury"] [title "Fahrenheit 451"] [year 1953] [subjects [censorship knowledge books society dystopia future]]] [[author-first "Charles"] [author-last "Dickens"] [title "A Tale of Two Cities"] [year 1859] [subjects [revolution love sacrifice resurrection society history]]] [[author-first "William"] [author-last "Golding"] [title "Lord of the Flies"] [year 1954] [subjects [society civilization savagery childhood morality island]]] [[author-first "Miguel de"] [author-last "Cervantes"] [title "Don Quixote"] [year 1605] [subjects [adventure idealism reality knight insanity literature]]] [[author-first "H.G."] [author-last "Wells"] [title "The War of the Worlds"] [year 1898] [subjects [invasion aliens society technology war humanity]]]] [[subjects] [set] update persist] step
[] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] []
[[[first-name "George"] [last-name "Orwell"] [country "United Kingdom"] [birth-year 1903] [death-year 1950] [sex m] [awards ["Nobel Prize in Literature"]]] [[first-name "Aldous"] [last-name "Huxley"] [country "United Kingdom"] [birth-year 1894] [death-year 1963] [sex m] [awards ["Humanitarian Award"]]] [[first-name "F. Scott"] [last-name "Fitzgerald"] [country "United States"] [birth-year 1896] [death-year 1940] [sex m] [awards ["Pulitzer Prize"]]] [[first-name "J.D."] [last-name "Salinger"] [country "United States"] [birth-year 1919] [death-year 2010] [sex m] [awards []]] [[first-name "Jane"] [last-name "Austen"] [country "United Kingdom"] [birth-year 1775] [death-year 1817] [sex f] [awards []]] [[first-name "Mary"] [last-name "Shelley"] [country "United Kingdom"] [birth-year 1797] [death-year 1851] [sex f] [awards []]] [[first-name "John"] [last-name "Steinbeck"] [country "United States"] [birth-year 1902] [death-year 1968] [sex m] [awards ["Nobel Prize in Literature"]]] [[first-name "Ernest"] [last-name "Hemingway"] [country "United States"] [birth-year 1899] [death-year 1961] [sex m] [awards ["Nobel Prize in Literature"]]] [[first-name "Harper"] [last-name "Lee"] [country "United States"] [birth-year 1926] [death-year 2016] [sex f] [awards ["Pulitzer Prize"]]] [[first-name "J.R.R."] [last-name "Tolkien"] [country "United Kingdom"] [birth-year 1892] [death-year 1973] [sex m] [awards []]] [[first-name "Joseph"] [last-name "Conrad"] [country "Poland"] [birth-year 1857] [death-year 1924] [sex m] [awards []]] [[first-name "Leo"] [last-name "Tolstoy"] [country "Russia"] [birth-year 1828] [death-year 1910] [sex m] [awards []]] [[first-name "Homer"] [last-name ""] [country "Greece"] [birth-year -800] [death-year -701] [sex m] [awards []]] [[first-name "Charlotte"] [last-name "Bronte"] [country "United Kingdom"] [birth-year 1816] [death-year 1855] [sex f] [awards []]] [[first-name "Mark"] [last-name "Twain"] [country "United States"] [birth-year 1835] [death-year 1910] 
[sex m] [awards []]] [[first-name "Ray"] [last-name "Bradbury"] [country "United States"] [birth-year 1920] [death-year 2012] [sex m] [awards []]] [[first-name "Charles"] [last-name "Dickens"] [country "United Kingdom"] [birth-year 1812] [death-year 1870] [sex m] [awards []]] [[first-name "William"] [last-name "Golding"] [country "United Kingdom"] [birth-year 1911] [death-year 1993] [sex m] [awards ["Nobel Prize in Literature"]]] [[first-name "Miguel de"] [last-name "Cervantes"] [country "Spain"] [birth-year 1547] [death-year 1616] [sex m] [awards []]] [[first-name "H.G."] [last-name "Wells"] [country "United Kingdom"] [birth-year 1866] [death-year 1946] [sex m] [awards []]]] [[awards] [set] update persist] step
[] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] []
2.25.2. Schema
Here's a database of the most basic and abstract observations (ignoring time and observer for now - those are all "now" and "me").
Down to a certain level, all attributes are also entities of their own, but at some point they have to be either axiomatic or circular. In the table below, type and format are axiomatic.
Entity | Attribute | Value |
---|---|---|
Alice | street | 123 Fake St |
Alice | email | alice@alice.com |
Alice | birthdate | 1/1/1980 |
street | type | attribute |
street | format | string |
type | attribute | |
format | string | |
birthdate | type | attribute |
birthdate | format | integer |
Alice | relationship | Bob |
Alice-Bob | type | relationship |
Alice-Bob | nature | friends |
Alice-Bob | trust | 5/10 |
Let's create an EAV table
"CREATE TABLE EAV ( Entity STRING, Attribute STRING, Value );" [] database
[]
"DROP TABLE EAV;" [] database
[]
Create indices (should really do this for all the columns)
"CREATE INDEX idx_entity ON EAV (Entity); CREATE INDEX idx_attribute ON EAV (Attribute); CREATE INDEX idx_valuestring ON EAV (ValueString); " [] database
[]
[[name "Butters"] [type dog] [weight 59.1] [breed [shepherd labrador]]] [breed] [set] update persist
[]
[[name "Butters"] [type dog] [weight 59.1] [breed [shepherd labrador]]] [breed] [set] update persist
"select * from EAV where Attribute=:attr;" [[":attr" name]] database
[[[Entity #b64 "oGAP/g0nSc6LRwQJJti3HA=="] [Value "Butters"] [Attribute name]]]
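The same EAV round trip can be shown in plain Python/sqlite3 as a sanity check. A hypothetical sketch: the entity id "e1" and the flat dict stand in for kcats' real entity hashes and typed values, and only one index is created:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE EAV (Entity TEXT, Attribute TEXT, Value)")
con.execute("CREATE INDEX idx_attribute ON EAV (Attribute)")

# Persist one object as entity/attribute/value triples.
obj = {"name": "Butters", "type": "dog", "weight": 59.1}
con.executemany("INSERT INTO EAV VALUES (?, ?, ?)",
                [("e1", k, v) for k, v in obj.items()])

# Parameterized query by attribute, like the :attr binding above.
rows = con.execute("SELECT Entity, Value FROM EAV WHERE Attribute = :attr",
                   {"attr": "name"}).fetchall()
# rows == [("e1", "Butters")]
```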
2.25.3. Implementation and experiments
Find book title/author that are about "adventure".
[[book? subjects adventure] [book? title title?] [book? author-last author?]] [title? author?]
Notes: It does seem like it should be possible to reuse sets of constraints, or produce constraints from a partial object. Maybe make this another layer on top of the datalog.
[[subjects adventure] [title ?] [author-last ?]]
We could even write queries that cross objects - what authors of adventure books were born in 1900:
[[subjects adventure] [author [[birthyear 1900]]]]
Notes: the select clause should use the variable names without the ? as the column alias. It's not clear, when book? is used in all 3 constraints, which one it should join on; I don't think it matters, but we need to choose somehow. Variables that are only used in one place should appear in the select clause; it should be an error if one doesn't.
"select c2.value as title, c3.value as author from EAV as c1 join EAV as c2 ON c1.entity = c2.entity AND c1.attribute = \"w|subjects\" AND c1.value = \"w|adventure\" AND c2.attribute = \"w|title\" join EAV as c3 ON c1.entity = c3.entity AND c3.attribute = \"w|author-last\" " database
[[[title "The Lord of the Rings"] [author "Tolkien"]] [[title "Adventures of Huckleberry Finn"] [author "Twain"]] [[title "Don Quixote"] [author "Cervantes"]]]
[[book "author" author] [book "title" title] [author "birthYear" 1945] [author "name" authorName]] [authorName title]
"select c4.value, c2.value from EAV as c1 join EAV as c2 ON c1.entity = c2.entity AND c2.attribute = \"w|title\" AND c1.attribute = \"w|author\" join EAV as c3 ON c1.value = c3.entity AND c3.attribute = \"w|birthYear\" AND c3.value = 1945 join EAV as c4 ON c1.value = c4.entity AND c4.attribute = \"w|name\"" database
[[unwound [database]] [asked [consume]] [reason "not enough items on stack"] [type error] [handled yes]] "select c4.value, c2.value\n from EAV as c1 \n join EAV as c2 ON c1.entity = c2.entity AND c2.attribute = \"w|title\" AND c1.attribute = \"w|author\"\n join EAV as c3 ON c1.value = c3.entity AND c3.attribute = \"w|birthYear\" AND c3.value = 1945\n join EAV as c4 ON c1.value = c4.entity AND c4.attribute = \"w|name\""
Let's copy the input/output here:
[[book? subjects adventure] [book? title title?] [book? author-last author?]] [title? author?] select c2.value as title, c3.value as author from EAV as c1 join EAV as c2 ON c1.entity = c2.entity AND c1.attribute = "w|subjects" AND c1.value = "w|adventure" AND c2.attribute = "w|title" join EAV as c3 ON c1.entity = c3.entity AND c3.attribute = "w|author-last"
We need to be able to scan through the constraints and output pairs that are linked via one or more variables. We could create a list of variables like so:
[[book?] [book? title?] [book? author?]]
[[[book? subjects adventure] [book? title title?]] [[book? subjects adventure] [book? author-last author?]] [[book? title title?] [book? author-last author?]]]
Then start with the first row, what other rows have intersection with it? (May need to add some set-based words here).
[[book?] [book? title?] [book? author?]]
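The link-finding step can be sketched directly. A hypothetical Python version, keeping the convention above that variables are words ending in "?", which reproduces the three linked pairs shown earlier (all linked through book?):

```python
def is_var(term):
    """A variable is a name ending in '?', per the datalog convention."""
    return isinstance(term, str) and term.endswith("?")

def links(constraints):
    """All pairs of constraints sharing at least one variable."""
    pairs = []
    for i, a in enumerate(constraints):
        for b in constraints[i + 1:]:
            if set(filter(is_var, a)) & set(filter(is_var, b)):
                pairs.append((a, b))
    return pairs

cs = [("book?", "subjects", "adventure"),
      ("book?", "title", "title?"),
      ("book?", "author-last", "author?")]
linked = links(cs)
# len(linked) == 3: every pair of constraints shares book?
```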
;; Module to operate on generic data, that are used throughout [[fork [[execute] map]] [triangle [[[] [[take] dip swap [[put] shield sink swap pair] bail] collect] shielddown]] [indexed-as-property [swap indexed [unwrap sink assign] map dropdown]] [join-all [[first empty] shield swap [join] step]] [selectkeys [set swap [first contains?] filter dropdown]] [invert [[reverse] map association]] ;; datalog variables [variable? [[[word?] [string last \? =]] [execute] every?]] [variable= [[[pair [variable?] every?] [=]] [execute] every?]] ;; datalog constraints [slots [[entity attribute value]]] [slot-combos [slots [slots [pair] map] map join-all]] [constraint [unwrap slots reverse label]] ;; links between datalog constraints [links [slot-combos [[wrap] map unwrap swapdown [[[lookup] inject] both] pairwise variable=] filter [unwrap pair sink pair [[index] lookup] map swap zip] map]] [all-links [[] sink [[[[index] lookup] shield] both ;; lookup the indices of both constraints [swapdown] dip sink ;; move the indices under the constraints [links] shielddown swap [dropdown dropdown join] dip] step drop]] [format-link [[join] inject unwrap [string] map "c{0}.{1} = c{2}.{3}" swap format]] ;; formatting pieces of query data into text [anded-together [" AND " interpose join-all]] ;; where clause data processing [where-data [[[index] lookup] shield swap [[[second variable? not] [first [index] unwrap = not]] [execute] every?] 
filter [swap prepend] map dropdown]] [format-where [[string] map "c{0}.{1} = :c{0}{1}" swap format]] [make-where [first [where] lookup anded-together]] [format-join [[[[on] lookup] [[where] lookup] [[index] lookup string]] [execute] map [join anded-together] inject "JOIN EAV c{1} ON {0}" swap format]] [make-query [rest [[on] [[format-link] map] update [format-join] shield [join] swap assign] map]] ;; SQL parameters for rusqlite [param-name [[string] map ":c{0}{1}" swap format]] [extract-params [[] association swap [[params] lookup join] step]] ;; SELECT clause [wordval? [second word?]] [invert [[reverse] map association]] [validate [[[second not] [first "All selected query variables must appear somewhere in constraints" [reason variable] label fail] when] map]] [select-data [dump swap [[slots selectkeys invert] shield [wordval?] filter join association] map swap [[[index] swap put ;; make a list of the variable and 'index' wrap [selectkeys] join ;; make the program to cut down map [count 2 =] filter first [second] map [first number?] [reverse] when] ;; items are in random order due to coming from association, fix the order map] shield dropdeep zip validate dump]] ;; query [extract-data [[[[unwrap all-links] [first where-data ; [join] inject unwrap ;; build the query param map [[[] swap [[param-name] shield [last] dip wrap swap assign] step] ;; build the actual query where clauses [[format-where] map]] fork]] fork] shield ;; combine extracted items [first] dip ;; keep the original constraint to add properties to unwrap unwrap [where params on] label join]] [format-select [[unwrap swap string butlast ;; remove the ? 
from the variable name for result column put [string] map "c{1}.{0} as {2}" swap format] map ", " interpose join-all]]] ;; This is the program we need to modify that is `query` [swap ;; expand all combinations of constraints [constraint] map ;;[] prepend ;; an empty constraint to represent the orignal EAV table we're joining with [index] indexed-as-property triangle ;; for each pair of constraints, build the "ON" clause data for the JOIN [extract-data] map [[extract-params] [make-query] [make-where] [swap select-data]] fork dropdown ;; don't need original anymore unwrap float [[join] lookup] map swap format-select [" " interpose join-all] dip triplet reverse "SELECT {0} from EAV as c0 {1} WHERE {2}" swap format swap dropdeep] let [query] label ;; the above program is the definition of `query` [[[book? subjects love] ;[book? subjects subjects?] [book? title book-title?] [book? author-last author-last?] [book? author-first author-first?] [author? first-name author-first?] [author? last-name author-last?] [author? sex m] [author? country "United States"] [author? birth-year author-birth-year?]] [book-title? author-first? author-last? author-birth-year?] query database ] let execute
[[book-title? author-first? author-last? author-birth-year?] [[[attribute subjects] [entity book?] [index 0] [on []] [params [[":c0attribute" subjects] [":c0value" love]]] [value love] [where ["c0.attribute = :c0attribute" "c0.value = :c0value"]]] [[attribute title] [entity book?] [index 1] [on [[[1 entity] [0 entity]]]] [params [[":c1attribute" title]]] [value book-title?] [where ["c1.attribute = :c1attribute"]]] [[attribute author-last] [entity book?] [index 2] [on [[[2 entity] [0 entity]] [[2 entity] [1 entity]]]] [params [[":c2attribute" author-last]]] [value author-last?] [where ["c2.attribute = :c2attribute"]]] [[attribute author-first] [entity book?] [index 3] [on [[[3 entity] [0 entity]] [[3 entity] [1 entity]] [[3 entity] [2 entity]]]] [params [[":c3attribute" author-first]]] [value author-first?] [where ["c3.attribute = :c3attribute"]]] [[attribute first-name] [entity author?] [index 4] [on [[[4 value] [3 value]]]] [params [[":c4attribute" first-name]]] [value author-first?] [where ["c4.attribute = :c4attribute"]]] [[attribute last-name] [entity author?] [index 5] [on [[[5 value] [2 value]] [[5 entity] [4 entity]]]] [params [[":c5attribute" last-name]]] [value author-last?] [where ["c5.attribute = :c5attribute"]]] [[attribute sex] [entity author?] [index 6] [on [[[6 entity] [4 entity]] [[6 entity] [5 entity]]]] [params [[":c6attribute" sex] [":c6value" m]]] [value m] [where ["c6.attribute = :c6attribute" "c6.value = :c6value"]]] [[attribute country] [entity author?] [index 7] [on [[[7 entity] [4 entity]] [[7 entity] [5 entity]] [[7 entity] [6 entity]]]] [params [[":c7attribute" country] [":c7value" "United States"]]] [value "United States"] [where ["c7.attribute = :c7attribute" "c7.value = :c7value"]]] [[attribute birth-year] [entity author?] [index 8] [on [[[8 entity] [4 entity]] [[8 entity] [5 entity]] [[8 entity] [6 entity]] [[8 entity] [7 entity]]]] [params [[":c8attribute" birth-year]]] [value author-birth-year?] 
[where ["c8.attribute = :c8attribute"]]]]] [[[book-title? [value 1]] [author-first? [value 3]] [author-last? [value 2]] [author-birth-year? [value 8]]]] [[[author-birth-year 1896] [author-first "F. Scott"] [author-last "Fitzgerald"] [book-title "The Great Gatsby"]]]
[[selectkeys [set [*1 [first] dive contains?] pack filter]] [invert [[reverse] map association]] [slots [[entity attribute value]]]] [[book-title? author-first? author-last? author-birth-year?] [[[attribute subjects] [entity book?] [index 0] [on []] [params [[":c0attribute" subjects] [":c0value" love]]] [value love] [where ["c0.value = :c0value" "c0.attribute = :c0attribute"]]] [[attribute title] [entity book?] [index 1] [on [[[1 entity] [0 entity]]]] [params [[":c1attribute" title]]] [value book-title?] [where ["c1.attribute = :c1attribute"]]] [[attribute author-last] [entity book?] [index 2] [on [[[2 entity] [0 entity]] [[2 entity] [1 entity]]]] [params [[":c2attribute" author-last]]] [value author-last?] [where ["c2.attribute = :c2attribute"]]] [[attribute author-first] [entity book?] [index 3] [on [[[3 entity] [0 entity]] [[3 entity] [1 entity]] [[3 entity] [2 entity]]]] [params [[":c3attribute" author-first]]] [value author-first?] [where ["c3.attribute = :c3attribute"]]] [[attribute first-name] [entity author?] [index 4] [on [[[4 value] [3 value]]]] [params [[":c4attribute" first-name]]] [value author-first?] [where ["c4.attribute = :c4attribute"]]] [[attribute last-name] [entity author?] [index 5] [on [[[5 value] [2 value]] [[5 entity] [4 entity]]]] [params [[":c5attribute" last-name]]] [value author-last?] [where ["c5.attribute = :c5attribute"]]] [[attribute sex] [entity author?] [index 6] [on [[[6 entity] [4 entity]] [[6 entity] [5 entity]]]] [params [[":c6attribute" sex] [":c6value" m]]] [value m] [where ["c6.value = :c6value" "c6.attribute = :c6attribute"]]] [[attribute country] [entity author?] [index 7] [on [[[7 entity] [4 entity]] [[7 entity] [5 entity]] [[7 entity] [6 entity]]]] [params [[":c7attribute" country] [":c7value" "United States"]]] [value "United States"] [where ["c7.attribute = :c7attribute" "c7.value = :c7value"]]] [[attribute birth-year] [entity author?] 
[index 8] [on [[[8 entity] [4 entity]] [[8 entity] [5 entity]] [[8 entity] [6 entity]] [[8 entity] [7 entity]]]] [params [[":c8attribute" birth-year]]] [value author-birth-year?] [where ["c8.attribute = :c8attribute"]]]] ;; look at data [[[slots selectkeys invert] [[index] selectkeys]] [execute] map ;; get both the index and the values take swap [join] step ] ;; join them together into one assoc map] ;; each constraint row data let execute
[[[book? entity] [index 0] [love value] [subjects attribute]] [[book-title? value] [book? entity] [index 1] [title attribute]] [[author-last attribute] [author-last? value] [book? entity] [index 2]] [[author-first attribute] [author-first? value] [book? entity] [index 3]] [[author-first? value] [author? entity] [first-name attribute] [index 4]] [[author-last? value] [author? entity] [index 5] [last-name attribute]] [[author? entity] [index 6] [m value] [sex attribute]] [[author? entity] [country attribute] [index 7] ["United States" value]] [[author-birth-year? value] [author? entity] [birth-year attribute] [index 8]]] [book-title? author-first? author-last? author-birth-year?]
[[attribute birth-year] [entity author?] [index 8] [on [[[8 entity] [4 entity]] [[8 entity] [5 entity]] [[8 entity] [6 entity]] [[8 entity] [7 entity]]]] [params [[":c8attribute" birth-year]]] [value author-birth-year?] [where ["c8.attribute = :c8attribute"]]] [entity attribute value] set [*1 [first] dive contains?] pack filter
[[a b]]
[[times5 [5 *]]] [6 times5] [draft dictionary swap [emit encode hashbytes] shield [[[words] swap update] shield dropdown] dip sink [dictmerge] shielddeep] dip float wrap [put] join swapdown [modules] swap update [dictionary program] label environment ;; TODO try using confine here [*1 capture evaluate [stack] lookup restore] pack execute
Test that putting the db query builder into a builtin module works
[[book? subjects love] [book? title title?] [book? author-last author-last?]] [author-last? title?] query database
[[[author-last "Fitzgerald"] [title "The Great Gatsby"]] [[author-last "Austen"] [title "Pride and Prejudice"]] [[author-last "Tolstoy"] [title "War and Peace"]] [[author-last "Bronte"] [title "Jane Eyre"]] [[author-last "Dickens"] [title "A Tale of Two Cities"]]]
We need to rewrite the link-finding logic so that it does everything in one pass.
[[variable? [[[word?] [string last \? =]] [execute] every?]] [linked-line? [second [first] map set swap contains?]] [slots [[entity attribute value]]] [index->slot [slots indexed]]] [[[book? subjects love] [book? title title?] [book? author-last author-last?] [book? author-first author-first?] [author? first-name author-first?] [author? last-name author-last?] [author? sex f] [author? birth-year birth-year?]] ;; add indices to the constraint rows and internal items [indexed [[1] [indexed] update] map [] swap ;; pull in the row number into each cell [unwrap [swap prepend [swap pair] inject] map] map ;; catenate [joiner] assemble unwrap ;; filter out non variables ; [second variable?] filter [unwrap wrap swap [put] swap prepend update] step ;; anything with only 1 entry is not a link ;[second count 1 >] filter association ] shield ;; now group by relevant line [count] dive 0 swap 1 range [swap [linked-line?] filter] map ] let execute
[[[love [[0 2]]] [book? [[0 0] [1 0] [2 0] [3 0]]] [subjects [[0 1]]]] [[book? [[0 0] [1 0] [2 0] [3 0]]] [title? [[1 2]]] [title [[1 1]]]] [[author-last? [[2 2] [5 2]]] [author-last [[2 1]]] [book? [[0 0] [1 0] [2 0] [3 0]]]] [[book? [[0 0] [1 0] [2 0] [3 0]]] [author-first? [[3 2] [4 2]]] [author-first [[3 1]]]] [[author? [[4 0] [5 0] [6 0] [7 0]]] [author-first? [[3 2] [4 2]]] [first-name [[4 1]]]] [[author-last? [[2 2] [5 2]]] [last-name [[5 1]]] [author? [[4 0] [5 0] [6 0] [7 0]]]] [[f [[6 2]]] [sex [[6 1]]] [author? [[4 0] [5 0] [6 0] [7 0]]]] [[author? [[4 0] [5 0] [6 0] [7 0]]] [birth-year [[7 1]]] [birth-year? [[7 2]]]]] [[author-first [[3 1]]] [author-first? [[3 2] [4 2]]] [author-last [[2 1]]] [author-last? [[2 2] [5 2]]] [author? [[4 0] [5 0] [6 0] [7 0]]] [birth-year [[7 1]]] [birth-year? [[7 2]]] [book? [[0 0] [1 0] [2 0] [3 0]]] [f [[6 2]]] [first-name [[4 1]]] [last-name [[5 1]]] [love [[0 2]]] [sex [[6 1]]] [subjects [[0 1]]] [title [[1 1]]] [title? [[1 2]]]]
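The one-pass grouping the experiment above performs, producing the variable → [[row slot] …] index shown in the result, can be sketched in Rust (a hedged illustration, not the kcats implementation): walk every cell once and record each variable's (row, slot) positions; any variable with two or more occurrences links its rows.

```rust
use std::collections::BTreeMap;

fn is_variable(s: &str) -> bool {
    s.ends_with('?')
}

/// One pass over all cells: map each variable to every (row, slot)
/// position where it occurs.
pub fn occurrences(rows: &[[&str; 3]]) -> BTreeMap<String, Vec<(usize, usize)>> {
    let mut index: BTreeMap<String, Vec<(usize, usize)>> = BTreeMap::new();
    for (r, row) in rows.iter().enumerate() {
        for (s, cell) in row.iter().enumerate() {
            if is_variable(cell) {
                index.entry(cell.to_string()).or_default().push((r, s));
            }
        }
    }
    index
}

fn main() {
    let rows = [
        ["book?", "title", "title?"],
        ["book?", "author-last", "author?"],
    ];
    // book? occurs in both rows at slot 0, so rows 0 and 1 are linked.
    println!("{:?}", occurrences(&rows));
}
```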
[[variable? [[[word?] [string last \? =]] [execute] every?]] [linked-line? [second [first] map set swap contains?]] [slots [[entity attribute value]]] [index->slot [slots indexed]]] [[[book? subjects love] [book? title title?] [book? author-last author-last?] [book? author-first author-first?] [author? first-name author-first?] [author? last-name author-last?] [author? sex f] [author? birth-year birth-year?]] ;; add indices to the constraint rows and internal items [indexed [[1] [indexed] update] map [] swap ;; pull in the row number into each cell [unwrap [swap prepend [swap pair] inject] map] map ;; catenate [joiner] assemble unwrap ;; filter out non variables ; [second variable?] filter [unwrap wrap swap [put] swap prepend update] step ;; anything with only 1 entry is not a link ;[second count 1 >] filter association ] shield ;; now group by relevant line [count] dive 0 swap 1 range [swap [linked-line?] filter] map ] let execute
"SELECT c2.value as title, c3.value as author from EAV as c1 JOIN EAV c2 ON c2.attribute = :c2attribute AND c1.entity = c2.entity JOIN EAV c3 ON c3.attribute = :c3attribute AND c2.entity = c3.entity AND c1.entity = c3.entity WHERE c1.attribute = :c1attribute AND c1.value = :c1value" [[":c1value" adventure] [":c2attribute" title] [":c1attribute" subjects] [":c3attribute" author-last]] database
[[[author "Tolkien"] [title "The Lord of the Rings"]] [[title "Adventures of Huckleberry Finn"] [author "Twain"]] [[title "Don Quixote"] [author "Cervantes"]]]
"SELECT COUNT(*) FROM EAV WHERE value = 's|Don Quixote'" [] database
[[[COUNT(*) 1]]]
[[a b c] [c d e] [e f g]] [second] [first] [[map] shielddown] dip swap [map] dip swap
[a c e] [b d f]
[[book? subjects adventure] [book? title title?]] indexed [[] inject] map [[c0.entity c0.attribute c0.value] ]
[[[entity book?] [attribute subjects] [index 1] [params [[":c1attribute" subjects] [":c1value" adventure]]] [on []] [value adventure] [where ["c1.attribute = :c1attribute" "c1.value = :c1value"]]] [[value title?] [attribute title] [entity book?] [where ["c2.attribute = :c2attribute"]] [on [[[2 entity] [1 entity]]]] [index 2] [params [[":c2attribute" title]]]] [[params [[":c3attribute" author-last]]] [entity book?] [index 3] [attribute author-last] [on [[[3 entity] [2 entity]] [[3 entity] [1 entity]]]] [value author?] [where ["c3.attribute = :c3attribute"]]]] [title? author?] [[selectkeys [set swap [first contains?] filter]] [wordval? [second word?]] [invert [[reverse] map association]] [validate [[[second not] [first "All selected query variables must appear somewhere in constraints" [reason variable] label fail] when] map]]] [swap [[[entity attribute value] selectkeys invert] shield [wordval?] filter join association] map swap [[[index] swap put ;; make a list of the variable and 'index' wrap [selectkeys] join ;; make the program to cut down map [count 2 =] filter first [second] map [first number?] [reverse] when ] map] shield dropdeep zip validate] draft
[[title? [value 2]] [author? [value 3]]]
2.25.4. Tests database module
dictionary [debug-step] [decache inscribe] step [database debug-step] swap [[[book? title title?] [book? author author?] [author? first-name author-first?] [author? country "United States"]] [title? author-first?] query] [program dictionary] label environment swap ;; lm env ;[[stack] [snapshot] divedown assign] dip ;; capture the stack at runtime using ;; set up the resolver eval-step eval-step eval-step eval-step advance advance advance ;; execute the program in the inner environment advance ;evaluate ;[stack] lookup restore ;; replace the stack with the result from the inner env
[[dictionary dictionary_redacted] [program [evaluate [stack] lookup restore]] [resolver [#b64 "qyKhcjHmD5kJNFA6/M0EkQOrh/j2zANmwrQFL1T1w7A=" #b64 "BD6O7rckGISnnk+5AXZaa3/2qY/72dX3O68AED3pG64="]] [stack [[[dictionary dictionary_redacted] [program [swap [constraint] map [index] indexed-as-property triangle [extract-data] map [[extract-params] [make-query] [make-where] [swap select-data]] fork dropdown unwrap float [[join] lookup] map swap format-select [" " interpose join-all] dip triplet reverse "SELECT {0} from EAV as c0 {1} WHERE {2}" swap format swap dropdeep]] [resolver [#b64 "nBeUBXUEzoR2f1PRlrc9UaOCrtcrgU+Bfkz87TTApQ0="]] [stack [[title? author-first?] [[book? title title?] [book? author author?] [author? first-name author-first?] [author? country "United States"]]]]] [title? author-first?] [[book? title title?] [book? author author?] [author? first-name author-first?] [author? country "United States"]]]]]
Debug what's wrong with inner evaluate
[1 2 +] stage [finished? not] [eval-step] while
[[dictionary dictionary_redacted] [program []] [resolver []] [stack [3]]]
Write unnest-envs so we can execute nested envs serially.
[3] stage wrap [2] join stage wrap [1] join stage [] swap [environment? not] [] [[stack first]] [execute] recur
[[asked [environment?]] [handled []] [reason "word is not defined"] [type error] [unwound [environment? not [[[dictionary dictionary_redacted] [program [[[dictionary dictionary_redacted] [program [[[dictionary dictionary_redacted] [program [3]] [resolver []] [stack []]] 2]] [resolver []] [stack []]] 1]] [resolver []] [stack []]] []] evert first [] [[stack first] [[environment? not] [] [[stack first]] [execute] recur] execute] branch]]] [[dictionary dictionary_redacted] [program [[[dictionary dictionary_redacted] [program [[[dictionary dictionary_redacted] [program [3]] [resolver []] [stack []]] 2]] [resolver []] [stack []]] 1]] [resolver []] [stack []]] []
2.26. TODO Reduce CPU cost of `shield` optimization
[[program [1 2 3 [+] shield]]] environment stepper 0 [drop inc] cram
25 [eval-step clone] []
[[program [1 2 3 [+] [] evert clone evert [execute] dip first swap; [restore] dip dropdown ;first take drop swap prepend restore ; take clonedown dip ]]] environment stepper collect ;0 [drop inc] cram
[[[program [2 3 [+] [] evert clone evert [execute] dip first swap]] [stack [1]]] [[program [3 [+] [] evert clone evert [execute] dip first swap]] [stack [2 1]]] [[stack [3 2 1]] [program [[+] [] evert clone evert [execute] dip first swap]]] [[stack [[+] 3 2 1]] [program [[] evert clone evert [execute] dip first swap]]] [[stack [[] [+] 3 2 1]] [program [evert clone evert [execute] dip first swap]]] [[program [clone evert [execute] dip first swap]] [stack [[[+] 3 2 1]]]] [[stack [[[+] 3 2 1] [[+] 3 2 1]]] [program [evert [execute] dip first swap]]] [[program [[execute] dip first swap]] [stack [[[[+] 3 2 1]] [+] 3 2 1]]] [[program [dip first swap]] [stack [[execute] [[[+] 3 2 1]] [+] 3 2 1]]] [[stack [[+] 3 2 1]] [program [execute [[[[+] 3 2 1]]] unwrap first swap]]] [[stack [3 2 1]] [program [+ [[[[+] 3 2 1]]] unwrap first swap]]] [[stack [5 1]] [program [[[[[+] 3 2 1]]] unwrap first swap]]] [[stack [[[[[+] 3 2 1]]] 5 1]] [program [unwrap first swap]]] [[program [first swap]] [stack [[[[+] 3 2 1]] 5 1]]] [[program [swap]] [stack [[[+] 3 2 1] 5 1]]] [[stack [5 [[+] 3 2 1] 1]] [program []]]] [eval-step clone] []
2.27. TODO Sort out feature dependencies
digraph G {
  // Define nodes
  core [label="Core words", style=filled, fillcolor=lightgreen, shape=rect];
  pipes [label="I/O pipes", style=filled, fillcolor=lightgreen, shape=rect];
  generators [label="Generators", style=filled, fillcolor=lightgreen, shape=rect];
  debug [label="Debugger", style=filled, fillcolor=lightgreen, shape=rect];
  crypto [label="Cryptography", style=filled, fillcolor=yellow, shape=rect];
  auth [label="Authentication Scripting", style=filled, fillcolor=yellow, shape=rect];
  localmod [label="Local Modules", style=filled, fillcolor=yellow, shape=rect];
  remotemod [label="Remote Modules"];
  revocation [label="Revocation Lists"];
  db [label="Persistent Database", style=filled, fillcolor=yellow, shape=rect];
  storage [label="Storage", fillcolor=lightgreen];
  cache [label="Hash-keyed blob cache", style=filled, fillcolor=yellow, shape=rect];
  streams [label="Content streams"];
  payments [label="Payments BTC"];
  contacts [label="Address Book"];
  messaging [label="Messaging"];
  backup [label="Data backup"];
  keys [label="Encryption Key management"];
  peers [label="Peer discovery"];
  names [label="Naming"];

  // Define edges to represent dependencies
  pipes -> core;
  generators -> core;
  debug -> core;
  crypto -> core;
  localmod -> core;
  localmod -> cache;
  localmod -> names;
  cache -> core;
  cache -> storage;
  auth -> crypto;
  auth -> revocation;
  remotemod -> localmod;
  remotemod -> auth;
  revocation -> db;
  db -> core;
  db -> localmod;
  db -> pipes;
  db -> storage;

  // Application level
  messaging -> contacts;
  contacts -> auth;
  contacts -> db;
  messaging -> db;
  messaging -> streams;
  streams -> db;
  streams -> names;
  streams -> peers;
  peers -> crypto;
  streams -> pipes;
  streams -> keys;
  //remotemod -> streams;
  keys -> db;
  contacts -> streams;
  revocation -> streams;
  backup -> streams;
  backup -> payments;
  names -> storage;
  names -> crypto;
}
2.28. TODO Improved error messages errorHandling
2.28.1. TODO Source mapping debugging errorHandling
It would be nice if we could tell which file any given word was read from. We could do this at read time, but I don't think our edn parser remembers byte positions, so that might need modification.
2.28.2. TODO Causal chaining
Like Java exceptions, we'd like to be able to say what other error caused this one. The chain would then go from the most general to the most specific, e.g. "could not load library", "could not open file", "permission denied".
The unwound field could be shortened too, to just whatever is extra, similar to Java's eliding of common stack elements in chained exceptions.
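A hedged sketch of what such causal chaining could look like, written in Rust with invented names (the real error structure uses kcats associations with reason/type/unwound fields): each error optionally boxes the more specific error that caused it, and rendering walks the chain from general to specific.

```rust
/// Sketch: an error that optionally carries the more specific error
/// that caused it, like Java's exception causes.
#[derive(Debug)]
pub struct ChainedError {
    pub reason: String,
    pub cause: Option<Box<ChainedError>>,
}

impl ChainedError {
    pub fn new(reason: &str) -> Self {
        ChainedError { reason: reason.to_string(), cause: None }
    }

    pub fn because(reason: &str, cause: ChainedError) -> Self {
        ChainedError { reason: reason.to_string(), cause: Some(Box::new(cause)) }
    }

    /// Render the chain from the most general error to the most specific.
    pub fn chain(&self) -> String {
        let mut parts = vec![self.reason.clone()];
        let mut cur = self.cause.as_deref();
        while let Some(e) = cur {
            parts.push(e.reason.clone());
            cur = e.cause.as_deref();
        }
        parts.join(": ")
    }
}

fn main() {
    let e = ChainedError::because(
        "could not load library",
        ChainedError::because("could not open file", ChainedError::new("permission denied")),
    );
    println!("{}", e.chain());
}
```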
2.29. INPROGRESS Generate word dependency graph
- State "INPROGRESS" from "TODO"
dictionary ;; nodes [[[first string wrap "\"{}\";\n" swap format] map "" swap [join] step] ;; now do edges [[take] ;; only non-builtins (not sure this works) [[1 definition] lookup [[list?] []] [execute] every?] keep ;; throw away everything but the definition [[1] [[definition] lookup ;; reduce to a set of words [] set swap [list? not] [[word?] [put] [drop] if] [] [step] recur] update] each ;10 taker ; collect ;; remove this to process the whole dict ;; expand to an edge for each pair [] [[second not] [drop generate] when ;; extract a pair [second] [pop pop [[first] divedown] shield swap pair [put] dip] when] ;; the 2nd item in the pair isn't empty [[string] map "\"{}\" -> \"{}\";\n" swap format] each joiner generate]] [execute] map "digraph G { {} {} }" swap format [[file "/tmp/graph4.dot"]] pipe-in swap encode put
[[to [[file "/tmp/graph4.dot"]]] [type tunnel] [values [[type bytes]]]] dictionary_redacted
;; walk the definition to extract words [[not] join [something?] swap pair wrap [[execute] every?] join [clone [[generate] dip [drop generate] while] dive]] [] set swap [list? not] [put] [] [step] recur
[clone dip dive drop every? execute generate join not pair something? swap while wrap]
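The word-collection walk itself is simple; here is a hedged Rust sketch of it (the item type and names are invented for illustration): recurse into nested lists, insert words into a set, ignore other literals, then emit one dot edge per (definition, dependency) pair.

```rust
use std::collections::BTreeSet;

/// Sketch: a kcats item is a word, a nested list, or some other literal.
#[derive(Clone)]
pub enum Item {
    Word(&'static str),
    List(Vec<Item>),
    Literal,
}

/// Recursively collect every word in a definition into a set.
pub fn words(item: &Item, out: &mut BTreeSet<&'static str>) {
    match item {
        Item::Word(w) => {
            out.insert(*w);
        }
        Item::List(items) => {
            for i in items {
                words(i, out);
            }
        }
        Item::Literal => {}
    }
}

fn main() {
    use Item::*;
    // The definition [swap [join] step] depends on swap, join and step.
    let def = List(vec![Word("swap"), List(vec![Word("join")]), Word("step")]);
    let mut deps = BTreeSet::new();
    words(&def, &mut deps);
    // One dot edge per (definition, dependency) pair:
    for w in &deps {
        println!("\"join-all\" -> \"{}\";", w);
    }
}
```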
Extract one item from inner
[foo []] ;pop pop [[first] divedown] shield swap pair [put] dip [second] [pop pop [[first] divedown] shield swap pair [put] dip] when ;; the 2nd item in the pair isn't empty
[foo []]
The resulting graph is very hard to read.
2.30. INPROGRESS Let doesn't inherit the current resolver
- State "INPROGRESS" from "TODO"
We need to capture the resolver like we do with the dictionary and append to it. This is probably a code smell that says the resolver should be part of the dictionary. Just bite the bullet and make the dictionary a struct.
using now does capture the current resolver and extends it at the front. Needs testing; I suspect the ordering might be a bit buggy.
As for sandboxing, I think the solution was that we have to actually remove words we mean to make inaccessible, so resolution order doesn't really matter anymore. In fact, we may be able to ignore deleted words when merging dictionaries. Removing words might not be a viable feature of modules.
2.31. DONE Make templating a rust function
- State "DONE" from "INPROGRESS"
- State "INPROGRESS" from "TODO"
Templating is important enough to make it a Rust function. No kcats programs need to be called during the execution of templating, so I think it's easily done in Rust. That also solves the chicken-and-egg problem where other stuff depends on template but template depends on it.
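For illustration, the positional substitution the kcats examples use ("{0}", "{1}", and bare "{}") is small in Rust. This is a hedged sketch with an invented name, not the actual implementation; it does not handle escaped literal braces.

```rust
/// Sketch: fill "{0}"/"{1}"-style (and bare "{}") placeholders with args.
/// Escaping of literal braces is not handled in this sketch.
pub fn fill(template: &str, args: &[&str]) -> String {
    let mut out = String::new();
    let mut next = 0usize; // next argument for a bare {}
    let mut chars = template.chars();
    while let Some(c) = chars.next() {
        if c != '{' {
            out.push(c);
            continue;
        }
        // Read the placeholder spec up to the closing brace.
        let mut spec = String::new();
        while let Some(d) = chars.next() {
            if d == '}' {
                break;
            }
            spec.push(d);
        }
        let idx = if spec.is_empty() {
            let i = next;
            next += 1;
            i
        } else {
            spec.parse().unwrap_or(0)
        };
        out.push_str(args.get(idx).copied().unwrap_or(""));
    }
    out
}

fn main() {
    println!("{}", fill("c{0}.{1} = c{2}.{3}", &["1", "entity", "2", "entity"]));
    println!("{}", fill("SELECT {} from EAV as c0", &["c1.value as title"]));
}
```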
2.32. TODO Add description to each example testing
2.33. TODO Add integration tests testing
2.34. TODO Size of option enums
use std::mem;
extern crate internment;
use internment::Intern;

fn main() {
    // Let's suppose this is our large type
    struct MyLargeType(Vec<u8>);

    // Using Intern to intern Vec<u8>
    type InternedLargeType = Intern<Vec<u8>>;

    // Size of Intern<Vec<u8>>
    println!("Size of Intern<Vec<u8>>: {}", mem::size_of::<Intern<Vec<u8>>>());

    // Size of Option<Intern<Vec<u8>>>
    println!("Size of Option<Intern<Vec<u8>>>: {}", mem::size_of::<Option<Intern<Vec<u8>>>>());

    // Example using MyLargeType with Intern
    let interned_value: InternedLargeType = Intern::new(vec![0; 32]);
    let instance_with_intern = Option::Some(interned_value);
    println!("Size of instance_with_intern: {}", mem::size_of_val(&instance_with_intern));

    // Another way to illustrate
    let interned_value_none: Option<InternedLargeType> = Option::None;
    let interned_value_some: Option<InternedLargeType> = Option::Some(Intern::new(vec![0; 32]));
    println!("Size of interned_value_none: {}", mem::size_of_val(&interned_value_none));
    println!("Size of interned_value_some: {}", mem::size_of_val(&interned_value_some));
}
Size of Intern<Vec<u8>>: 8
Size of Option<Intern<Vec<u8>>>: 8
Size of instance_with_intern: 8
Size of interned_value_none: 8
Size of interned_value_some: 8
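The 8-byte Option here comes from Rust's niche optimization: the interned handle is a non-null pointer, so None can be represented by the otherwise-impossible null bit pattern and no extra discriminant is needed. The same effect is visible with plain Box, without the internment crate:

```rust
use std::mem::size_of;

fn main() {
    // Box<T> is a non-null pointer, so Option<Box<T>> needs no extra tag:
    // None is represented by the null pointer (niche optimization).
    assert_eq!(size_of::<Box<Vec<u8>>>(), size_of::<usize>());
    assert_eq!(size_of::<Option<Box<Vec<u8>>>>(), size_of::<usize>());
    println!("Option<Box<_>> is still one pointer wide");
}
```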
2.35. TODO Support converting Association to Set
This should probably return a set of keys, but the issue is that the way Derive works, we can't choose a different conversion based on the final type we want. So we end up with a list of pairs, because the first step in the conversion is to an Iterator<Entry>. Until this is fixed, we'll comment out the relevant test in the word intersection.
[[[[a b] [c d] [e f]] association [[c x] [e y]] association intersection] [[c e] set] "Intersection of two associations expressed as set of common keys"]
2.36. INPROGRESS Debug nested envs
- State "INPROGRESS" from "TODO"
The problem is if we have this
[1 2 +] stage wrap [evaluate] [program stack] label environment wrap [evaluate] [program stack] label environment
[[dictionary dictionary_redacted] [program [evaluate]] [resolver []] [stack [[[dictionary dictionary_redacted] [program [evaluate]] [resolver []] [stack [[[dictionary dictionary_redacted] [program [1 2 +]] [resolver []] [stack []]]]]]]]]
How do we step through the execution? I think perhaps we need to do this:
Let's say we want to eval-step. The ToS is an env (or eval-step would fail anyway). If that env is executing evaluate, we recur down the envs until we hit one that's not, and run eval-step there.
The question is, do we have to unnest everything and then re-nest?
I think we can just build a path to pass to update. And then afterward recur again to drop evaluate from programs that are already done.
;; Make the nested envs [1 2 +] stage wrap [evaluate] [program stack] label environment wrap [evaluate] [program stack] label environment ;; stepping program [eval-step] [[evaluating? [[program] lookup [evaluate] starts?]]] [[[] swap [[evaluating?] [[stack 0] clone [lookup] dip swap [join] dip ] ;; append the next part of the path to the accumulator while swap] shield dropdeep] dip ;; under the stepping prog [update] shielddown flip drop [0 -2 slice clone] [collect] shielddeep [[[[[evaluating?] [[stack 0] lookup finished?]] [execute] every?] [[program] [rest] update] when] [update] shielddown flip drop drop] step] let execute
[[dictionary dictionary_redacted] [program [evaluate]] [resolver []] [stack [[[dictionary dictionary_redacted] [program [evaluate]] [resolver []] [stack [[[dictionary dictionary_redacted] [program [2 +]] [resolver []] [stack [1]]]]]]]]]
5 execute
[[actual 5] [asked [program]] [handled []] [reason "type mismatch"] [type error] [unwound [execute]]] 5
[1 2 +] stage wrap [0] [eval-step] update
[[[dictionary dictionary_redacted] [program [2 +]] [resolver []] [stack [1]]]]
[1 2 [3 4 [5 6 [7 8]]]] [2 2 0] [inc] update
[1 2 [3 4 [6 6 [7 8]]]]
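The path-based update demonstrated above can be sketched generically. This is a hedged Rust sketch with an invented value type, mirroring the [1 2 [3 4 [5 6 [7 8]]]] / [2 2 0] / inc example: each path element indexes into a nested list, and the function is applied at the end of the path.

```rust
/// Sketch: a nested value is either a number or a list of values.
#[derive(Debug, PartialEq, Clone)]
pub enum V {
    N(i64),
    L(Vec<V>),
}

/// Apply `f` to the item reached by indexing successively with `path`,
/// leaving everything else untouched (mismatched paths are ignored here).
pub fn update_at(v: &mut V, path: &[usize], f: &dyn Fn(i64) -> i64) {
    match (v, path) {
        (V::N(n), []) => *n = f(*n),
        (V::L(items), [i, rest @ ..]) => {
            if let Some(child) = items.get_mut(*i) {
                update_at(child, rest, f);
            }
        }
        _ => {}
    }
}

fn main() {
    use V::*;
    // [1 2 [3 4 [5 6 [7 8]]]] [2 2 0] [inc] update
    let mut v = L(vec![N(1), N(2), L(vec![N(3), N(4), L(vec![N(5), N(6), L(vec![N(7), N(8)])])])]);
    update_at(&mut v, &[2, 2, 0], &|n| n + 1);
    println!("{:?}", v);
}
```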
2.36.1. DONE write environment? word
- State "DONE" from "INPROGRESS"
- State "INPROGRESS" from "TODO"
Will be an axiom word.