serialization/deserialization of datum in VM
All checks were successful
per-push tests / build (push) Successful in 1m34s
per-push tests / test-frontend (push) Successful in 41s
per-push tests / test-utility (push) Successful in 45s
per-push tests / test-backend (push) Successful in 44s
per-push tests / timed-decomposer-parse (push) Successful in 50s

This commit adds logic to serialize and deserialize datum, as well
as the start of a binary format for whole programs. It implements
serialize and deserialize routines per datum type. Tests are included
for complex cases. Similar code existed in the organelle package and
has been centralized here.

Additionally, this commit makes release target binaries smaller and
faster.
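
As a rough illustration of the per-type encoding described above, here is a minimal sketch of a tag-prefixed, length-prefixed round trip. `Value`, `encode`, and `decode` are hypothetical stand-ins and not the commit's `Datum` API; the tag constants copy the `DeserializerControlCode` values from the diff below, and the length prefix is assumed to be a big-endian `usize` as in the diff.

```rust
use std::convert::TryInto;

// Hypothetical, simplified stand-in for the commit's Datum; not the real type.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Bool(bool),
    Char(u8),
    Bytes(Vec<u8>),
}

// Tag bytes mirroring the DeserializerControlCode values in this commit.
const TAG_FALSE: u8 = 0x07;
const TAG_TRUE: u8 = 0x08;
const TAG_CHAR: u8 = 0x09;
const TAG_BYTES: u8 = 0x0B;
// Width of a serialized usize length prefix on the current target.
const US: usize = (usize::BITS / 8) as usize;

fn encode(v: &Value) -> Vec<u8> {
    match v {
        Value::Bool(false) => vec![TAG_FALSE],
        Value::Bool(true) => vec![TAG_TRUE],
        Value::Char(c) => vec![TAG_CHAR, *c],
        Value::Bytes(b) => {
            // tag, then big-endian length, then the payload itself
            let mut out = vec![TAG_BYTES];
            out.extend_from_slice(&b.len().to_be_bytes());
            out.extend_from_slice(b);
            out
        }
    }
}

fn decode(input: &[u8]) -> Result<Value, &'static str> {
    // split_first avoids indexing into an empty slice
    let (&tag, rest) = input.split_first().ok_or("empty input")?;
    match tag {
        TAG_FALSE => Ok(Value::Bool(false)),
        TAG_TRUE => Ok(Value::Bool(true)),
        TAG_CHAR => rest.first().map(|c| Value::Char(*c)).ok_or("truncated char"),
        TAG_BYTES => {
            let len_bytes = rest.get(..US).ok_or("truncated length")?;
            let len = usize::from_be_bytes(len_bytes.try_into().unwrap());
            let end = US.checked_add(len).ok_or("length overflow")?;
            let body = rest.get(US..end).ok_or("truncated payload")?;
            Ok(Value::Bytes(body.to_vec()))
        }
        _ => Err("unknown tag"),
    }
}
```

Because the length prefix is `usize`-wide, an image written this way is only readable on targets with the same pointer width; the sketch inherits that property from the format rather than fixing it.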

Signed-off-by: Ava Affine <ava@sunnypup.io>
Ava Apples Affine 2025-08-26 17:11:37 +00:00
parent 0f85292e6f
commit 389bf6e9a0
6 changed files with 471 additions and 84 deletions

View file

@ -1,3 +1,13 @@
cargo-features = ["profile-rustflags"]
[workspace] [workspace]
resolver = "2" resolver = "2"
members = ["mycelium", "decomposer", "hyphae", "organelle"] members = ["mycelium", "decomposer", "hyphae", "organelle"]
[profile.release]
opt-level = 3
strip = true
lto = true
codegen-units = 1
panic = "abort"
rustflags = [ "-Zlocation-detail=none", "-Zfmt-debug=none" ]

View file

@ -15,15 +15,20 @@
* along with this program. If not, see <https://www.gnu.org/licenses/>. * along with this program. If not, see <https://www.gnu.org/licenses/>.
*/ */
use crate::serializer::DeserializerControlCode;
use core::ops::{Index, Deref, DerefMut}; use core::ops::{Index, Deref, DerefMut};
use core::ptr::NonNull; use core::ptr::NonNull;
use alloc::{vec, vec::Vec};
use alloc::rc::Rc; use alloc::rc::Rc;
use alloc::vec::Vec;
use alloc::boxed::Box; use alloc::boxed::Box;
use alloc::fmt::Debug; use alloc::fmt::Debug;
use organelle::Number; use organelle::{Number, Fraction, SymbolicNumber, Float, ScientificNotation};
const US: usize = (usize::BITS / 8) as usize;
const IS: usize = (isize::BITS / 8) as usize;
/* NOTE /* NOTE
* decided not to implement a cache or a singleton heap manager * decided not to implement a cache or a singleton heap manager
@ -141,40 +146,6 @@ impl<T: Clone> Gc<T> {
} }
} }
#[derive(PartialEq, Debug)]
pub enum Datum {
Number(Number),
Bool(bool),
Cons(Cons),
Char(u8),
String(Vec<u8>),
Vector(Vec<Gc<Datum>>),
ByteVector(Vec<u8>),
None
}
// implemented by hand to force deep copy on Cons datum
impl Clone for Datum {
fn clone(&self) -> Datum {
match self {
Datum::Number(n) => Datum::Number(n.clone()),
Datum::Bool(n) => Datum::Bool(n.clone()),
Datum::Cons(n) => Datum::Cons(n.deep_copy()),
Datum::Char(n) => Datum::Char(n.clone()),
Datum::String(n) => Datum::String(n.clone()),
Datum::Vector(n) =>
Datum::Vector(n.clone()),
Datum::ByteVector(n) =>
Datum::ByteVector(n.clone()),
Datum::None => Datum::None,
}
}
fn clone_from(&mut self, source: &Self) {
*self = source.clone();
}
}
#[derive(Clone, PartialEq, Debug)] #[derive(Clone, PartialEq, Debug)]
pub struct Cons(pub Option<Gc<Datum>>, pub Option<Gc<Datum>>); pub struct Cons(pub Option<Gc<Datum>>, pub Option<Gc<Datum>>);
@ -293,6 +264,260 @@ impl Index<usize> for Cons {
} }
} }
#[derive(PartialEq, Debug)]
pub enum Datum {
Number(Number),
Bool(bool),
Cons(Cons),
Char(u8),
String(Vec<u8>),
Vector(Vec<Gc<Datum>>),
ByteVector(Vec<u8>),
None
}
// implemented by hand to force deep copy on Cons datum
impl Clone for Datum {
fn clone(&self) -> Datum {
match self {
Datum::Number(n) => Datum::Number(n.clone()),
Datum::Bool(n) => Datum::Bool(n.clone()),
Datum::Cons(n) => Datum::Cons(n.deep_copy()),
Datum::Char(n) => Datum::Char(n.clone()),
Datum::String(n) => Datum::String(n.clone()),
Datum::Vector(n) =>
Datum::Vector(n.clone()),
Datum::ByteVector(n) =>
Datum::ByteVector(n.clone()),
Datum::None => Datum::None,
}
}
fn clone_from(&mut self, source: &Self) {
*self = source.clone();
}
}
impl Into<Vec<u8>> for Datum {
fn into(self) -> Vec<u8> {
match self {
Datum::Number(n) => {
let mut out: Vec<u8> = vec![];
match n {
Number::Sci(num) => {
out.push(DeserializerControlCode::SciNumber as u8);
for ele in num.0.to_be_bytes().iter() {
out.push(*ele);
}
for ele in num.1.to_be_bytes().iter() {
out.push(*ele);
}
out
},
Number::Flt(num) => {
out.push(DeserializerControlCode::FltNumber as u8);
for ele in num.0.to_be_bytes().iter() {
out.push(*ele);
}
out
},
Number::Fra(num) => {
out.push(DeserializerControlCode::FraNumber as u8);
for ele in num.0.to_be_bytes().iter() {
out.push(*ele);
}
for ele in num.1.to_be_bytes().iter() {
out.push(*ele);
}
out
},
Number::Sym(num) => {
match num {
SymbolicNumber::Inf => out.push(DeserializerControlCode::SymInf as u8),
SymbolicNumber::NaN => out.push(DeserializerControlCode::SymNan as u8),
SymbolicNumber::NegInf => out.push(DeserializerControlCode::SymNegInf as u8),
SymbolicNumber::NegNan => out.push(DeserializerControlCode::SymNegNan as u8),
}
out
}
}
},
Datum::Bool(false) => vec![DeserializerControlCode::BoolFalse as u8],
Datum::Bool(true) => vec![DeserializerControlCode::BoolTrue as u8],
Datum::Cons(c) => {
if let Some(lop) = &c.0 {
if let Some(rop) = &c.1 {
let mut out = vec![DeserializerControlCode::FullCons as u8];
out.append(&mut (*lop.deref()).clone().into());
out.append(&mut (*rop.deref()).clone().into());
out
} else {
let mut out = vec![DeserializerControlCode::LeftCons as u8];
out.append(&mut (*lop.deref()).clone().into());
out
}
} else {
if let Some(rop) = &c.1 {
let mut out = vec![DeserializerControlCode::RightCons as u8];
out.append(&mut (*rop.deref()).clone().into());
out
} else {
vec![DeserializerControlCode::EmptyCons as u8]
}
}
},
Datum::Char(c) => vec![DeserializerControlCode::Char as u8, c],
Datum::String(c) => {
let mut v = vec![DeserializerControlCode::String as u8];
v.append(&mut c.len().to_be_bytes().to_vec());
v.append(&mut c.clone());
v
},
Datum::ByteVector(c) => {
let mut v = vec![DeserializerControlCode::ByteVec as u8];
v.append(&mut c.len().to_be_bytes().to_vec());
v.append(&mut c.clone());
v
},
Datum::Vector(c) => {
let mut v = vec![DeserializerControlCode::Vector as u8];
v.append(&mut c.len().to_be_bytes().to_vec());
c.iter().for_each(|i| v.append(&mut (*i.deref()).clone().into()));
v
},
Datum::None => vec![],
}
}
}
impl TryFrom<&[u8]> for Datum {
type Error = &'static str;
fn try_from(value: &[u8]) -> Result<Self, Self::Error> {
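// an empty slice (e.g. a serialized Datum::None) is an error, not a panic on value[0]
let code = *value.first().ok_or("cannot deserialize datum from empty input")?;
match DeserializerControlCode::try_from(code)? {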
// this entire block goes away when we finish redoing organelle
DeserializerControlCode::SymInf =>
Ok(Datum::Number(Number::Sym(SymbolicNumber::Inf))),
DeserializerControlCode::SymNan =>
Ok(Datum::Number(Number::Sym(SymbolicNumber::NaN))),
DeserializerControlCode::SymNegInf =>
Ok(Datum::Number(Number::Sym(SymbolicNumber::NegInf))),
DeserializerControlCode::SymNegNan =>
Ok(Datum::Number(Number::Sym(SymbolicNumber::NegNan))),
DeserializerControlCode::SciNumber if value.len() >= 1 + 4 + IS => {
let i = f32::from_be_bytes(value[1..5].try_into().unwrap());
let j = isize::from_be_bytes(value[5..(5 + IS)].try_into().unwrap());
Ok(Datum::Number(Number::Sci(ScientificNotation(i, j))))
},
DeserializerControlCode::FltNumber if value.len() >= 9 => {
let i = f64::from_be_bytes(value[1..9].try_into().unwrap());
Ok(Datum::Number(Number::Flt(Float(i))))
},
DeserializerControlCode::FraNumber if value.len() >= 1 + (IS * 2) => {
let i = isize::from_be_bytes(value[1..(1 + IS)].try_into().unwrap());
let j = isize::from_be_bytes(value[(1 + IS)..(1 + IS + IS)].try_into().unwrap());
Ok(Datum::Number(Number::Fra(Fraction(i, j))))
},
DeserializerControlCode::BoolFalse => Ok(Datum::Bool(false)),
DeserializerControlCode::BoolTrue => Ok(Datum::Bool(true)),
DeserializerControlCode::EmptyCons if value.len() >= 1 =>
Ok(Datum::Cons(Cons(None, None))),
DeserializerControlCode::Char if value.len() >= 2 =>
Ok(Datum::Char(value[1])),
DeserializerControlCode::String if value.len() >= 1 + US => {
let len = usize::from_be_bytes(value[1..(1 + US)].try_into().unwrap());
if len < 1 {
Ok(Datum::String(vec![]))
} else if value.len() - (1 + US) < len {
Err("String vector backing is corrupted or truncated!")
} else {
Ok(Datum::String(value[(1 + US)..(1 + US + len)].to_vec()))
}
},
DeserializerControlCode::ByteVec if value.len() >= 1 + US => {
let len = usize::from_be_bytes(value[1..(1 + US)].try_into().unwrap());
if len < 1 {
Ok(Datum::ByteVector(vec![]))
} else if value.len() - (1 + US) < len {
Err("ByteVector vector backing is corrupted or truncated!")
} else {
Ok(Datum::ByteVector(value[(1 + US)..(1 + US + len)].to_vec()))
}
},
DeserializerControlCode::Vector if value.len() >= 1 + US => {
let len = usize::from_be_bytes(value[1..(1 + US)].try_into().unwrap());
if len < 1 {
Ok(Datum::Vector(vec![]))
} else {
let mut cursor: usize = 1 + US;
let mut ovec: Vec<Gc<Datum>> = vec![];
for _ in 0..len {
ovec.push(Datum::try_from(&value[cursor..])?.into());
cursor += ovec.last().unwrap().byte_length();
}
Ok(Datum::Vector(ovec))
}
},
DeserializerControlCode::LeftCons if value.len() >= 2 =>
Ok(Datum::Cons(Cons(Some(Datum::try_from(&value[1..])?.into()), None))),
DeserializerControlCode::RightCons if value.len() >= 2 =>
Ok(Datum::Cons(Cons(None, Some(Datum::try_from(&value[1..])?.into())))),
DeserializerControlCode::FullCons if value.len() >= 3 => {
let lop = Datum::try_from(&value[1..])?;
let next = 1 + lop.byte_length();
let rop = Datum::try_from(&value[next..])?;
Ok(Datum::Cons(Cons(Some(lop.into()), Some(rop.into()))))
}
_ => Err("Deserializer Control Code not valid in this context")
}
}
}
impl Datum {
pub fn byte_length(&self) -> usize {
match self {
Datum::None => 0,
Datum::Bool(_) => 1,
Datum::Char(_) => 2,
// This will need to change with organelle
Datum::Number(n) => match n {
Number::Sym(_) => 1,
Number::Flt(_) => 1 + 8,
// f32 mantissa followed by an isize exponent
Number::Sci(_) => 1 + 4 + IS,
// both fraction fields are isize, matching the deserializer
Number::Fra(_) => 1 + (IS * 2),
},
Datum::String(s) => 1 + US + s.len(),
Datum::ByteVector(s) => 1 + US + s.len(),
Datum::Vector(s) => {
let mut c = 1 + US;
for i in s.iter() {
c += i.byte_length();
}
c
},
Datum::Cons(c) => {
let mut size = 1;
if let Some(x) = c.0.as_ref() {
size += x.byte_length();
}
if let Some(x) = c.1.as_ref() {
size += x.byte_length();
}
size
},
}
}
}
#[cfg(test)] #[cfg(test)]
mod tests { mod tests {
@ -362,4 +587,28 @@ mod tests {
drop(reference_holder); drop(reference_holder);
assert!(!*(*copied_data).0); assert!(!*(*copied_data).0);
} }
#[test]
fn serialize_deserialize_datum_tests() {
let cases = vec![
Datum::Number("2/3".parse::<Number>().unwrap()),
Datum::Number("-4/5".parse::<Number>().unwrap()),
Datum::Number("2e45".parse::<Number>().unwrap()),
Datum::Number("1.2432566".parse::<Number>().unwrap()),
Datum::Number("+inf.0".parse::<Number>().unwrap()),
Datum::Cons(Cons(Some(Datum::Bool(true).into()), Some(Datum::Bool(false).into()))),
Datum::Cons(Cons(None, Some(Datum::Bool(true).into()))),
Datum::Cons(Cons(Some(Datum::Bool(true).into()), None)),
Datum::Cons(Cons(None, None)),
Datum::Cons(Cons(Some(Datum::Cons(Cons(None, Some(Datum::Bool(false).into()))).into()), None)),
Datum::Vector(vec![Datum::Bool(true).into(), Datum::Bool(true).into(), Datum::Bool(false).into()]),
Datum::Vector(vec![]),
Datum::Vector(vec![Datum::Vector(vec![Datum::Bool(true).into()]).into(), Datum::Bool(false).into()]),
];
for i in cases.iter() {
let j: Vec<u8> = i.clone().into();
assert_eq!(*i, Datum::try_from(j.as_slice()).unwrap());
}
}
} }
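
The vector and cons branches above advance their cursor by calling `byte_length()` on each freshly parsed datum, which walks the datum a second time. One alternative, sketched here under the assumption of a much smaller format (booleans and length-prefixed vectors only), is to have the decoder return the number of bytes it consumed. `Node` and `decode_at` are hypothetical names and are not part of this commit.

```rust
use std::convert::TryInto;

// Hypothetical miniature of the format: booleans and length-prefixed vectors only.
#[derive(Debug, PartialEq)]
enum Node {
    Bool(bool),
    Vector(Vec<Node>),
}

// Tag bytes mirroring the DeserializerControlCode values in this commit.
const TAG_FALSE: u8 = 0x07;
const TAG_TRUE: u8 = 0x08;
const TAG_VECTOR: u8 = 0x0C;
const US: usize = (usize::BITS / 8) as usize;

// Returns the decoded node together with the number of bytes it occupied.
fn decode_at(input: &[u8]) -> Result<(Node, usize), &'static str> {
    let (&tag, rest) = input.split_first().ok_or("unexpected end of input")?;
    match tag {
        TAG_FALSE => Ok((Node::Bool(false), 1)),
        TAG_TRUE => Ok((Node::Bool(true), 1)),
        TAG_VECTOR => {
            let len_bytes = rest.get(..US).ok_or("truncated vector length")?;
            let len = usize::from_be_bytes(len_bytes.try_into().unwrap());
            // cursor is relative to `input`, so it starts past the tag and the length
            let mut cursor = 1 + US;
            let mut items = Vec::new();
            for _ in 0..len {
                let tail = input.get(cursor..).ok_or("truncated vector body")?;
                let (item, used) = decode_at(tail)?;
                cursor += used;
                items.push(item);
            }
            Ok((Node::Vector(items), cursor))
        }
        _ => Err("unknown tag"),
    }
}
```

The trade-off is a slightly wider return type in exchange for a single pass over nested data and no indexing that can panic on truncated input.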

View file

@ -21,7 +21,7 @@ pub mod hmap;
pub mod stackstack; pub mod stackstack;
pub mod instr; pub mod instr;
pub mod vm; pub mod vm;
pub mod util; pub mod serializer;
pub mod heap; pub mod heap;
extern crate alloc; extern crate alloc;

View file

@ -16,6 +16,7 @@
*/ */
use crate::instr::Operation; use crate::instr::Operation;
use crate::heap::Datum;
use alloc::vec::Vec; use alloc::vec::Vec;
use alloc::vec; use alloc::vec;
@ -23,6 +24,31 @@ use alloc::vec;
use core::ops::Index; use core::ops::Index;
use core::mem::transmute; use core::mem::transmute;
#[repr(u8)]
#[derive(Debug, Clone, PartialEq)]
pub enum DeserializerControlCode {
SciNumber = 0x00,
FltNumber = 0x01,
FraNumber = 0x02,
SymInf = 0x03,
SymNan = 0x04,
SymNegInf = 0x05,
SymNegNan = 0x06,
BoolFalse = 0x07,
BoolTrue = 0x08,
Char = 0x09,
String = 0x0A,
ByteVec = 0x0B,
Vector = 0x0C,
EmptyCons = 0x0D,
LeftCons = 0x0E,
RightCons = 0x0F,
FullCons = 0x10,
DataChunk = 0x11,
CodeChunk = 0x12,
}
#[repr(u8)] #[repr(u8)]
#[derive(Debug, Clone, PartialEq)] #[derive(Debug, Clone, PartialEq)]
pub enum Address { pub enum Address {
@ -38,6 +64,12 @@ pub enum Address {
Char = 0xfa, // immutable access only Char = 0xfa, // immutable access only
} }
#[derive(Debug, Clone, PartialEq)]
pub struct Deserializer<'a> {
pub input: &'a [u8],
// TODO: Debug levels for errors
}
#[derive(Debug, Clone, PartialEq)] #[derive(Debug, Clone, PartialEq)]
pub struct Operand(pub Address, pub usize); pub struct Operand(pub Address, pub usize);
@ -45,7 +77,7 @@ pub struct Operand(pub Address, pub usize);
pub struct Instruction(pub Operation, pub Vec<Operand>); pub struct Instruction(pub Operation, pub Vec<Operand>);
#[derive(Debug, Clone, PartialEq)] #[derive(Debug, Clone, PartialEq)]
pub struct Program(pub Vec<Instruction>); pub struct Program(pub Vec<Datum>, pub Vec<Instruction>);
impl Into<u8> for Address { impl Into<u8> for Address {
fn into(self) -> u8 { fn into(self) -> u8 {
@ -162,16 +194,39 @@ impl Instruction {
impl TryFrom<&[u8]> for Program { impl TryFrom<&[u8]> for Program {
type Error = &'static str; type Error = &'static str;
fn try_from(value: &[u8]) -> Result<Self, Self::Error> { fn try_from(value: &[u8]) -> Result<Self, Self::Error> {
let mut data: Vec<Datum> = vec![];
let mut prog: Vec<Instruction> = vec![]; let mut prog: Vec<Instruction> = vec![];
let mut cur = 0;
let mut parse_data = || -> Result<usize, Self::Error> {
let mut cur = 0;
// first() returns None on empty input instead of panicking
if value.first() != Some(&(DeserializerControlCode::DataChunk as u8)) {
return Ok(cur);
}
cur += 1;
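// stop at the end of input so a missing code chunk cannot index out of bounds
while cur < value.len() && value[cur] != DeserializerControlCode::CodeChunk as u8 {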
let datum: Datum = value[cur..].try_into()?;
cur += datum.byte_length();
data.push(datum);
}
Ok(cur)
};
let mut parse_code = |cur: usize| -> Result<(), Self::Error> {
let mut cur = cur;
// get() returns None when cur is past the end of input
if value.get(cur) != Some(&(DeserializerControlCode::CodeChunk as u8)) {
return Err("no code chunk detected in program");
}
cur += 1;
while cur < value.len() { while cur < value.len() {
let instruction: Instruction = value[cur..].try_into()?; let instruction: Instruction = value[cur..].try_into()?;
cur += instruction.byte_length() as usize; cur += instruction.byte_length() as usize;
prog.push(instruction); prog.push(instruction);
} }
Ok(())
};
Ok(Program(prog)) parse_code(parse_data()?)?;
Ok(Program(data, prog))
} }
} }
@ -188,10 +243,37 @@ impl Into<Vec<u8>> for Program {
impl<'a> Index<usize> for Program { impl<'a> Index<usize> for Program {
type Output = Instruction; type Output = Instruction;
fn index(&self, index: usize) -> &Instruction { fn index(&self, index: usize) -> &Instruction {
self.0.get(index).expect("access to out of bounds instruction in vm") self.1.get(index).expect("access to out of bounds instruction in vm")
} }
} }
impl TryFrom<u8> for DeserializerControlCode {
type Error = &'static str;
fn try_from(value: u8) -> Result<Self, Self::Error> {
match value {
0x00 => Ok(DeserializerControlCode::SciNumber),
0x01 => Ok(DeserializerControlCode::FltNumber),
0x02 => Ok(DeserializerControlCode::FraNumber),
0x03 => Ok(DeserializerControlCode::SymInf),
0x04 => Ok(DeserializerControlCode::SymNan),
0x05 => Ok(DeserializerControlCode::SymNegInf),
0x06 => Ok(DeserializerControlCode::SymNegNan),
0x07 => Ok(DeserializerControlCode::BoolFalse),
0x08 => Ok(DeserializerControlCode::BoolTrue),
0x09 => Ok(DeserializerControlCode::Char),
0x0A => Ok(DeserializerControlCode::String),
0x0B => Ok(DeserializerControlCode::ByteVec),
0x0C => Ok(DeserializerControlCode::Vector),
0x0D => Ok(DeserializerControlCode::EmptyCons),
0x0E => Ok(DeserializerControlCode::LeftCons),
0x0F => Ok(DeserializerControlCode::RightCons),
0x10 => Ok(DeserializerControlCode::FullCons),
0x11 => Ok(DeserializerControlCode::DataChunk),
0x12 => Ok(DeserializerControlCode::CodeChunk),
_ => Err("invalid control code")
}
}
}
#[cfg(test)] #[cfg(test)]
mod tests { mod tests {
@ -279,15 +361,24 @@ mod tests {
#[test] #[test]
fn test_program_parse() { fn test_program_parse() {
let bytes1 = [instr::LINK.0, 0xf3, 0xf4]; let bytes1 = [
let out1 = vec![Instruction(instr::LINK, DeserializerControlCode::DataChunk as u8,
DeserializerControlCode::BoolTrue as u8,
DeserializerControlCode::CodeChunk as u8,
instr::LINK.0, 0xf3, 0xf4
];
let out1a = vec![Datum::Bool(true)];
let out1b = vec![Instruction(instr::LINK,
vec![Operand(Address::Oper1, 0), Operand(Address::Oper2, 0)])]; vec![Operand(Address::Oper1, 0), Operand(Address::Oper2, 0)])];
let res1 = let res1 =
TryInto::<Program>::try_into(&bytes1[..]); TryInto::<Program>::try_into(&bytes1[..]);
assert!(res1.is_ok()); assert!(res1.is_ok());
assert_eq!(res1.unwrap().0, out1); let res1 = res1.unwrap();
assert_eq!(res1.0, out1a);
assert_eq!(res1.1, out1b);
let bytes2 = [ let bytes2 = [
DeserializerControlCode::CodeChunk as u8,
instr::LINK.0, 0xf3, 0xf4, instr::LINK.0, 0xf3, 0xf4,
instr::CLEAR.0, 0xf0, 0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 instr::CLEAR.0, 0xf0, 0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
]; ];
@ -304,6 +395,20 @@ mod tests {
let res2 = let res2 =
TryInto::<Program>::try_into(&bytes2[..]); TryInto::<Program>::try_into(&bytes2[..]);
assert!(res2.is_ok()); assert!(res2.is_ok());
assert_eq!(res2.unwrap().0, out2); assert_eq!(res2.unwrap().1, out2);
}
#[test]
fn test_serializer_control_code_consistency() {
let mut input: u8 = 0x00;
loop {
if DeserializerControlCode::try_from(input)
.map(|x| assert_eq!(x as u8, input))
.is_err() {
break;
}
input += 1;
}
} }
} }
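
The program layout parsed above is an optional `DataChunk` marker followed by datums, then a mandatory `CodeChunk` marker followed by instructions. The following is a minimal sketch of walking that layout with bounds-checked access only. `split_program` and its `datum_len` callback are hypothetical: the commit itself parses each datum and uses `Datum::byte_length`, whereas this sketch only measures and splits.

```rust
// Marker bytes mirroring DeserializerControlCode::DataChunk / CodeChunk.
const DATA_CHUNK: u8 = 0x11;
const CODE_CHUNK: u8 = 0x12;

// Splits a program image into (data region, code region).
// `datum_len` is a hypothetical callback reporting how many bytes the datum
// at the start of the given slice occupies.
fn split_program<'a>(
    image: &'a [u8],
    datum_len: impl Fn(&[u8]) -> Result<usize, &'static str>,
) -> Result<(&'a [u8], &'a [u8]), &'static str> {
    let mut cur = 0;
    let mut data_start = 0;
    if image.first() == Some(&DATA_CHUNK) {
        cur = 1;
        data_start = 1;
        loop {
            match image.get(cur) {
                // ran off the end without ever seeing the code marker
                None => return Err("no code chunk detected in program"),
                Some(&CODE_CHUNK) => break,
                Some(_) => {
                    let n = datum_len(&image[cur..])?;
                    if n == 0 {
                        // a zero-length datum would loop forever
                        return Err("zero-length datum in data chunk");
                    }
                    cur = cur.checked_add(n).ok_or("datum length overflow")?;
                }
            }
        }
    }
    if image.get(cur) != Some(&CODE_CHUNK) {
        return Err("no code chunk detected in program");
    }
    Ok((&image[data_start..cur], &image[cur + 1..]))
}
```

The marker cannot simply be searched for, because `0x12` may legitimately appear inside a datum payload; that is why the walk has to step datum by datum.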

View file

@ -21,7 +21,7 @@ use organelle::{Fraction, Number, Numeric};
use crate::hmap::QuickMap; use crate::hmap::QuickMap;
use crate::stackstack::StackStack; use crate::stackstack::StackStack;
use crate::instr as i; use crate::instr as i;
use crate::util::{Operand, Program, Address}; use crate::serializer::{Operand, Program, Address, Instruction};
use crate::heap::{Gc, Datum, Cons}; use crate::heap::{Gc, Datum, Cons};
use core::ops::DerefMut; use core::ops::DerefMut;
@ -42,7 +42,7 @@ pub struct VM {
// execution environment // execution environment
pub stack: StackStack<Gc<Datum>>, pub stack: StackStack<Gc<Datum>>,
pub symtab: QuickMap<Operand>, pub symtab: QuickMap<Operand>,
pub prog: Program, pub prog: Vec<Instruction>,
pub traps: Vec<Arc<dyn Fn(&mut VM)>>, pub traps: Vec<Arc<dyn Fn(&mut VM)>>,
// data registers // data registers
@ -61,10 +61,14 @@ pub struct VM {
impl From<Program> for VM { impl From<Program> for VM {
fn from(value: Program) -> Self { fn from(value: Program) -> Self {
let mut s = StackStack::<Gc<Datum>>::new();
value.0.iter()
.for_each(|d| s.push_current_stack(d.clone().into()));
VM{ VM{
stack: StackStack::<Gc<Datum>>::new(), stack: s,
symtab: QuickMap::new(), symtab: QuickMap::new(),
prog: value, prog: value.1,
traps: vec![], traps: vec![],
expr: Datum::None.into(), expr: Datum::None.into(),
oper: array::from_fn(|_| Datum::None.into()), oper: array::from_fn(|_| Datum::None.into()),
@ -88,7 +92,7 @@ impl VM {
.with_stack(stack) .with_stack(stack)
.with_symbols(syms) .with_symbols(syms)
.with_traps(traps) .with_traps(traps)
.to_owned() // not efficient, but we are not executing .to_owned() // not efficient
} }
pub fn with_stack( pub fn with_stack(
@ -125,11 +129,11 @@ impl VM {
} }
pub fn run_program(&mut self) { pub fn run_program(&mut self) {
if self.prog.0.len() < 1 { if self.prog.len() < 1 {
self.running = false; self.running = false;
} }
while self.ictr < self.prog.0.len() { while self.ictr < self.prog.len() {
if self.err_state || !self.running { if self.err_state || !self.running {
return; return;
} }
@ -154,10 +158,10 @@ impl VM {
} }
} }
if self.ictr > self.prog.0.len() { if self.ictr > self.prog.len() {
e!("attempt to execute out of bounds instruction"); e!("attempt to execute out of bounds instruction");
} }
let instr = &self.prog.0[self.ictr].clone(); let instr = &self.prog[self.ictr].clone();
// get or set according to addressing mode // get or set according to addressing mode
macro_rules! access { macro_rules! access {
@ -194,7 +198,7 @@ impl VM {
e!("illegal argument to jump"); e!("illegal argument to jump");
}; };
if target >= self.prog.0.len() { if target >= self.prog.len() {
e!("out of bounds jump caught"); e!("out of bounds jump caught");
} }
@ -616,7 +620,7 @@ impl VM {
mod tests { mod tests {
use super::*; use super::*;
use crate::instr as i; use crate::instr as i;
use crate::util::{Program, Instruction, Operand}; use crate::serializer::{Program, Instruction, Operand};
use core::ops::Deref; use core::ops::Deref;
use organelle::Float; use organelle::Float;
@ -699,7 +703,7 @@ mod tests {
#[test] #[test]
fn isa_trap_tests() { fn isa_trap_tests() {
let mut vm = VM::from(Program(vec![ let mut vm = VM::from(Program(vec![], vec![
Instruction(i::TRAP, vec![Operand(Address::Numer, 0)]) Instruction(i::TRAP, vec![Operand(Address::Numer, 0)])
])).with_traps(Some(vec![ ])).with_traps(Some(vec![
Arc::from(|state: &mut VM| { Arc::from(|state: &mut VM| {
@ -720,7 +724,7 @@ mod tests {
#[test] #[test]
fn isa_symtable_tests() { fn isa_symtable_tests() {
let mut vm = VM::from(Program(vec![ let mut vm = VM::from(Program(vec![], vec![
Instruction(i::BIND, vec![Operand(Address::Stack, 1), Instruction(i::BIND, vec![Operand(Address::Stack, 1),
Operand(Address::Stack, 0)]), Operand(Address::Stack, 0)]),
])).with_stack(Some({ ])).with_stack(Some({
@ -743,7 +747,7 @@ mod tests {
vm.run_program(); vm.run_program();
assert!(case.test_passes(&vm)); assert!(case.test_passes(&vm));
let mut vm = VM::from(Program(vec![ let mut vm = VM::from(Program(vec![], vec![
Instruction(i::BIND, vec![Operand(Address::Stack, 1), Instruction(i::BIND, vec![Operand(Address::Stack, 1),
Operand(Address::Stack, 0)]), Operand(Address::Stack, 0)]),
Instruction(i::BOUND, vec![Operand(Address::Stack, 1)]) Instruction(i::BOUND, vec![Operand(Address::Stack, 1)])
@ -767,7 +771,7 @@ mod tests {
vm.run_program(); vm.run_program();
assert!(case.test_passes(&vm)); assert!(case.test_passes(&vm));
let mut vm = VM::from(Program(vec![ let mut vm = VM::from(Program(vec![], vec![
Instruction(i::BIND, vec![Operand(Address::Stack, 1), Instruction(i::BIND, vec![Operand(Address::Stack, 1),
Operand(Address::Stack, 0)]), Operand(Address::Stack, 0)]),
Instruction(i::UNBIND, vec![Operand(Address::Stack, 1)]) Instruction(i::UNBIND, vec![Operand(Address::Stack, 1)])
@ -794,7 +798,7 @@ mod tests {
#[test] #[test]
fn isa_stack_tests() { fn isa_stack_tests() {
let mut vm = VM::from(Program(vec![ let mut vm = VM::from(Program(vec![], vec![
Instruction(i::CONST, vec![Operand(Address::Expr, 0), Instruction(i::CONST, vec![Operand(Address::Expr, 0),
Operand(Address::Numer, 4)]), Operand(Address::Numer, 4)]),
Instruction(i::PUSH, vec![Operand(Address::Expr, 0)]) Instruction(i::PUSH, vec![Operand(Address::Expr, 0)])
@ -812,7 +816,7 @@ mod tests {
vm.run_program(); vm.run_program();
assert!(case.test_passes(&vm)); assert!(case.test_passes(&vm));
let mut vm = VM::from(Program(vec![ let mut vm = VM::from(Program(vec![], vec![
Instruction(i::CONST, vec![Operand(Address::Expr, 0), Instruction(i::CONST, vec![Operand(Address::Expr, 0),
Operand(Address::Numer, 4)]), Operand(Address::Numer, 4)]),
Instruction(i::PUSH, vec![Operand(Address::Expr, 0)]), Instruction(i::PUSH, vec![Operand(Address::Expr, 0)]),
@ -834,7 +838,7 @@ mod tests {
vm.run_program(); vm.run_program();
assert!(case.test_passes(&vm)); assert!(case.test_passes(&vm));
let mut vm = VM::from(Program(vec![ let mut vm = VM::from(Program(vec![], vec![
Instruction(i::CONST, vec![Operand(Address::Expr, 0), Instruction(i::CONST, vec![Operand(Address::Expr, 0),
Operand(Address::Numer, 4)]), Operand(Address::Numer, 4)]),
Instruction(i::PUSH, vec![Operand(Address::Expr, 0)]), Instruction(i::PUSH, vec![Operand(Address::Expr, 0)]),
@ -857,7 +861,7 @@ mod tests {
vm.run_program(); vm.run_program();
assert!(case.test_passes(&vm)); assert!(case.test_passes(&vm));
let mut vm = VM::from(Program(vec![ let mut vm = VM::from(Program(vec![], vec![
Instruction(i::CONST, vec![Operand(Address::Expr, 0), Instruction(i::CONST, vec![Operand(Address::Expr, 0),
Operand(Address::Numer, 4)]), Operand(Address::Numer, 4)]),
Instruction(i::PUSH, vec![Operand(Address::Expr, 0)]), Instruction(i::PUSH, vec![Operand(Address::Expr, 0)]),
@ -886,7 +890,7 @@ mod tests {
#[test] #[test]
fn isa_load_dupl_clear() { fn isa_load_dupl_clear() {
let mut vm = VM::from(Program(vec![ let mut vm = VM::from(Program(vec![], vec![
Instruction(i::CONST, vec![Operand(Address::Expr, 0), Instruction(i::CONST, vec![Operand(Address::Expr, 0),
Operand(Address::Numer, 4)]), Operand(Address::Numer, 4)]),
Instruction(i::LINK, vec![Operand(Address::Expr, 0), Instruction(i::LINK, vec![Operand(Address::Expr, 0),
@ -915,7 +919,7 @@ mod tests {
#[test] #[test]
fn isa_nop_halt_panic() { fn isa_nop_halt_panic() {
let mut vm = VM::from(Program(vec![ let mut vm = VM::from(Program(vec![], vec![
Instruction(i::NOP, vec![]) Instruction(i::NOP, vec![])
])); ]));
@ -929,7 +933,7 @@ mod tests {
vm.run_program(); vm.run_program();
assert!(case.test_passes(&vm)); assert!(case.test_passes(&vm));
let mut vm = VM::from(Program(vec![ let mut vm = VM::from(Program(vec![], vec![
Instruction(i::HALT, vec![]), Instruction(i::HALT, vec![]),
Instruction(i::PUSH, vec![Operand(Address::Numer, 1)]) Instruction(i::PUSH, vec![Operand(Address::Numer, 1)])
])); ]));
@ -944,7 +948,7 @@ mod tests {
vm.run_program(); vm.run_program();
assert!(case.test_passes(&vm)); assert!(case.test_passes(&vm));
let mut vm = VM::from(Program(vec![ let mut vm = VM::from(Program(vec![], vec![
Instruction(i::PANIC, vec![Operand(Address::Stack, 0)]) Instruction(i::PANIC, vec![Operand(Address::Stack, 0)])
])).with_stack({ ])).with_stack({
let mut i = StackStack::<Gc<Datum>>::new(); let mut i = StackStack::<Gc<Datum>>::new();
@ -965,7 +969,7 @@ mod tests {
#[test] #[test]
fn isa_inc_dec() { fn isa_inc_dec() {
let mut vm = VM::from(Program(vec![ let mut vm = VM::from(Program(vec![], vec![
Instruction(i::CONST, vec![Operand(Address::Expr, 0), Instruction(i::CONST, vec![Operand(Address::Expr, 0),
Operand(Address::Numer, 4)]), Operand(Address::Numer, 4)]),
Instruction(i::INC, vec![Operand(Address::Expr, 0)]), Instruction(i::INC, vec![Operand(Address::Expr, 0)]),
@ -981,7 +985,7 @@ mod tests {
vm.run_program(); vm.run_program();
assert!(case.test_passes(&vm)); assert!(case.test_passes(&vm));
let mut vm = VM::from(Program(vec![ let mut vm = VM::from(Program(vec![], vec![
Instruction(i::CONST, vec![Operand(Address::Expr, 0), Instruction(i::CONST, vec![Operand(Address::Expr, 0),
Operand(Address::Numer, 4)]), Operand(Address::Numer, 4)]),
Instruction(i::DEC, vec![Operand(Address::Expr, 0)]), Instruction(i::DEC, vec![Operand(Address::Expr, 0)]),
@ -1000,7 +1004,7 @@ mod tests {
#[test] #[test]
fn isa_jmp_jmpif() { fn isa_jmp_jmpif() {
let mut vm = VM::from(Program(vec![ let mut vm = VM::from(Program(vec![], vec![
Instruction(i::JMP, vec![Operand(Address::Instr, 2)]), Instruction(i::JMP, vec![Operand(Address::Instr, 2)]),
Instruction(i::HALT, vec![]), Instruction(i::HALT, vec![]),
Instruction(i::CONST, vec![Operand(Address::Expr, 0), Instruction(i::CONST, vec![Operand(Address::Expr, 0),
@ -1017,7 +1021,7 @@ mod tests {
vm.run_program(); vm.run_program();
assert!(case.test_passes(&vm)); assert!(case.test_passes(&vm));
let mut vm = VM::from(Program(vec![ let mut vm = VM::from(Program(vec![], vec![
Instruction(i::LINK, vec![Operand(Address::Stack, 0), Instruction(i::LINK, vec![Operand(Address::Stack, 0),
Operand(Address::Expr, 0)]), Operand(Address::Expr, 0)]),
Instruction(i::JMPIF, vec![Operand(Address::Instr, 3)]), Instruction(i::JMPIF, vec![Operand(Address::Instr, 3)]),
@ -1040,7 +1044,7 @@ mod tests {
vm.run_program(); vm.run_program();
assert!(case.test_passes(&vm)); assert!(case.test_passes(&vm));
let mut vm = VM::from(Program(vec![ let mut vm = VM::from(Program(vec![], vec![
Instruction(i::LINK, vec![Operand(Address::Stack, 0), Instruction(i::LINK, vec![Operand(Address::Stack, 0),
Operand(Address::Expr, 0)]), Operand(Address::Expr, 0)]),
Instruction(i::JMPIF, vec![Operand(Address::Instr, 3)]), Instruction(i::JMPIF, vec![Operand(Address::Instr, 3)]),
@ -1063,7 +1067,7 @@ mod tests {
vm.run_program(); vm.run_program();
assert!(case.test_passes(&vm)); assert!(case.test_passes(&vm));
let mut vm = VM::from(Program(vec![ let mut vm = VM::from(Program(vec![], vec![
Instruction(i::JMP, vec![Operand(Address::Instr, 300)]), Instruction(i::JMP, vec![Operand(Address::Instr, 300)]),
])).with_stack({ ])).with_stack({
let mut i = StackStack::new(); let mut i = StackStack::new();
@ -1085,7 +1089,7 @@ mod tests {
#[test] #[test]
fn isa_conversions() { fn isa_conversions() {
let mut vm = VM::from(Program(vec![ let mut vm = VM::from(Program(vec![], vec![
// load from stack into expr // load from stack into expr
Instruction(i::DUPL, vec![Operand(Address::Stack, 0), Instruction(i::DUPL, vec![Operand(Address::Stack, 0),
Operand(Address::Expr, 0)]), Operand(Address::Expr, 0)]),
@ -1140,7 +1144,7 @@ mod tests {
#[test] #[test]
fn isa_consts() { fn isa_consts() {
let mut vm = VM::from(Program(vec![ let mut vm = VM::from(Program(vec![], vec![
Instruction(i::CONST, vec![Operand(Address::Expr, 0), Instruction(i::CONST, vec![Operand(Address::Expr, 0),
Operand(Address::Numer, 1)]), Operand(Address::Numer, 1)]),
Instruction(i::PUSH, vec![Operand(Address::Expr, 0)]), Instruction(i::PUSH, vec![Operand(Address::Expr, 0)]),

View file

@ -8,9 +8,28 @@ project: a POSIX shell interpreter as well as a compiled to bytecode language fo
running on ESP32 devices. running on ESP32 devices.
## Current Status ## Current Status
Currently the lexer and parser are implemented. On an X86 machine equipped with 64GB The lexer and parser are implemented. On an X86 machine equipped with 64GB RAM
RAM and an AMD Ryzen 7900 CPU this lexer and parser are capable of creating a fully and an AMD Ryzen 7900 CPU this lexer and parser are capable of creating a fully
validated abstract syntax tree from approximately 11200 lines of handwritten scheme validated abstract syntax tree from approximately 11200 lines of handwritten scheme
in about 55 milliseconds on average. in about 55 milliseconds on average.
Currently the bytecode VM and its instruction set are next to implement.
HyphaeVM is mostly implemented. The instruction set is defined and implemented,
including extensibility interfaces and the VM layout. Additionally, instruction
encoding and decoding are implemented. Garbage collection is implemented (via
reference counting). Currently being implemented are datum encoding/decoding and
full program encoding/decoding. Yet to be approached are debugging routines, CLI
utilities, and concurrency features. However, documentation has been written on
programming with HyphaeVM.
The R7RS-Small Scheme to HyphaeVM compiler is not implemented.
R7RS-Large is not implemented.
The Linux/Mac/Windows runtime and extended compiler are not implemented.
Documentation is not implemented.