Relish: Rusty Expressive LIsp SHell

Note: this document is best read using a dedicated ORG mode editor

Purpose statement

The purpose of Relish is to create a highly portable, easy to integrate language that can be used in many environments.

Goals

  • Iterate on the ideas and designs that were tested with SHS
  • Create a usable POSIX shell
  • Create usable applications/scripts
  • Maintain quality code coverage
  • Use no unsafe code outside of the POSIX module

Stretch Goals

  • Create an interpreter that can be booted on one or more SOCs
  • Create an interpreter that can be embedded in another application
  • Create UI bindings

How to use

Syntax

S-Expressions

Relish fits within the LISP family of languages alongside venerable languages like Scheme or Common Lisp. Lisps are HOMOICONIC which means that the code is data, and that there is a direct correlation between the code as written and the program as stored in memory. This is achieved through S-EXPRESSIONS. An S-Expression (or symbolic expression) is a tree of nested lists. Programs in Relish (and most other lisps) are written with S-Expressions, and are then represented in memory as trees of nested linked lists.

An example:

  (top-level element1 "element2" 3 (nested 2 5 2) (peer-nested))

As stored in memory:

top-level -> element1 -> "element2" -> 3 -> [] -> [] ->
                                            |     ^-> peer-nested ->
                                            \-> nested -> 2 -> 5 -> 2 ->

Each node in memory has type information and potentially a corresponding entry in a global symbol table.
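
The layout above can be sketched in Rust, the language Relish is written in. This is only an illustration of "a tree of nested linked lists"; the names Node, List, list, and render are hypothetical and are not taken from the Relish codebase.

```rust
/// A single node in the tree (hypothetical type, for illustration only).
#[derive(Debug, Clone, PartialEq)]
enum Node {
    Symbol(String),
    Str(String),
    Int(i128),
    List(Box<List>),
}

/// A singly linked list of nodes, built from cons cells.
#[derive(Debug, Clone, PartialEq)]
enum List {
    Nil,
    Cons(Node, Box<List>),
}

/// Build a linked list node from a vector of nodes.
fn list(items: Vec<Node>) -> Node {
    let mut tail = List::Nil;
    // Walk backwards so that the first item ends up at the head.
    for item in items.into_iter().rev() {
        tail = List::Cons(item, Box::new(tail));
    }
    Node::List(Box::new(tail))
}

/// Render a node back into S-expression text.
fn render(node: &Node) -> String {
    match node {
        Node::Symbol(s) => s.clone(),
        Node::Str(s) => format!("\"{}\"", s),
        Node::Int(i) => i.to_string(),
        Node::List(l) => {
            let mut parts = Vec::new();
            let mut cur: &List = &**l;
            // Follow the chain of cons cells until Nil is reached.
            while let List::Cons(head, rest) = cur {
                parts.push(render(head));
                cur = &**rest;
            }
            format!("({})", parts.join(" "))
        }
    }
}
```

Building the example tree with list(...) and passing it to render gives back the text it was written as, which is the "direct correlation" between written code and stored program described above.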

Data types

Relish leverages the following data types:

  • Strings: delimited by ', ", or `
  • Integers: up to 128 bit signed integers
  • Floats: all floats are stored as 64 bit floats
  • Booleans: true or false
  • Symbols: an un-delimited chunk of text containing alphanumerics, -, _, or ?

Symbols and functions can hold data of any type; there is no restriction on what can be set or passed to what. Internally, however, Relish is typed, and many builtin functions are picky about the types passed to them.
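
As a rough Rust sketch of what "internally typed" can mean, the enum below mirrors the five data types listed above, and add_ints is a picky function that rejects anything but integers. Value, type_name, and add_ints are hypothetical names; this is not Relish's actual add builtin, whose accepted types are not described here.

```rust
/// One variant per data type listed above (hypothetical representation).
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Str(String),
    Int(i128),
    Float(f64),
    Bool(bool),
    Symbol(String),
}

/// Human-readable type name, used in error messages.
fn type_name(value: &Value) -> &'static str {
    match value {
        Value::Str(_) => "string",
        Value::Int(_) => "integer",
        Value::Float(_) => "float",
        Value::Bool(_) => "boolean",
        Value::Symbol(_) => "symbol",
    }
}

/// A picky function: sums integers and returns a type error for anything else.
fn add_ints(args: &[Value]) -> Result<Value, String> {
    let mut total: i128 = 0;
    for arg in args {
        match arg {
            Value::Int(n) => total += *n,
            other => {
                return Err(format!(
                    "type error: expected integer, got {}",
                    type_name(other)
                ))
            }
        }
    }
    Ok(Value::Int(total))
}
```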

Calling a function

S-Expressions can represent function calls in addition to trees of data. A function call is a list of data starting with a symbol that is defined to be a function:

(dothing arg1 arg2 arg3)

Function calls are executed as soon as the tree is evaluated. See the following example:

(add 3 (add 5 2))

In this example, (add 5 2) is evaluated first, its result is then passed to (add 3 ...). In infix form: 3 + (5 + 2).
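
The inside-out order can be illustrated with a tiny Rust evaluator. It is a sketch under simplifying assumptions (integers only, a single function named add); Expr and eval are hypothetical names rather than Relish's internals.

```rust
/// A minimal expression tree: an integer or a call with arguments.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Int(i128),
    Call(String, Vec<Expr>),
}

/// Evaluate an expression; arguments are evaluated before the call itself.
fn eval(expr: &Expr) -> Result<i128, String> {
    match expr {
        Expr::Int(n) => Ok(*n),
        Expr::Call(name, args) => {
            // Nested calls are resolved here, before the outer function runs.
            let mut values = Vec::with_capacity(args.len());
            for arg in args {
                values.push(eval(arg)?);
            }
            match name.as_str() {
                "add" => Ok(values.iter().sum()),
                other => Err(format!("undefined function: {}", other)),
            }
        }
    }
}
```

Evaluating the tree for (add 3 (add 5 2)) first reduces the inner call to 7 and then computes 3 + 7.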

Control flow

If

An if form is the most basic form of conditional evaluation offered by Relish. It is a function that takes lazily evaluated arguments: a condition, a then clause, and an else clause. If the condition evaluates to true, the then clause is evaluated and the result returned. Otherwise the else clause is evaluated and the result is returned. If the condition evaluates to neither true nor false (a non-boolean value) a type error is returned.

  ;; simple condition
  (if true
      (echo "its true!")
      (echo "its false!"))

  ;; more advanced condition, with hypothetical data
  (if (get-my-flag global-state)
      (echo "my flag is already on!")
      (turn-on-my-flag global-state))
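
To make the lazy evaluation concrete, here is a small Rust sketch in which the two branches are closures, so only the selected one ever runs. Value and lazy_if are hypothetical names; the sketch only models the behavior described above and is not Relish's implementation.

```rust
/// A few value types, enough to model the condition check.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Bool(bool),
    Int(i128),
    Str(String),
}

/// Evaluate only the branch selected by the condition.
/// A non-boolean condition produces a type error instead of running a branch.
fn lazy_if<T, E>(cond: Value, then: T, otherwise: E) -> Result<Value, String>
where
    T: FnOnce() -> Value,
    E: FnOnce() -> Value,
{
    match cond {
        Value::Bool(true) => Ok(then()),
        Value::Bool(false) => Ok(otherwise()),
        other => Err(format!("type error: expected a boolean, got {:?}", other)),
    }
}
```

Because the branches are passed unevaluated, a side effect in the else clause (such as turning on a flag) does not happen when the condition is true.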

While

Another popular control flow structure is the while loop. This is implemented as a condition followed by one or more bodies that are lazily evaluated only if the condition is true. Like the if form, if the conditional returns a non-boolean value the while loop will return an error.

  (while (get-my-flag global-state) ;; if false, returns (nothing) immediately
    (dothing)                       ;; this is evaluated
    "simple token"                  ;; this is also evaluated
    (toggle-my-flag global-state))  ;; this is also evaluated

Let

Let is one of the most powerful forms Relish offers. The first body in a call to let is a list of lists: specifically, a list of variable declarations that look like this: (name value).

Each successive variable definition can build off of the last one, like this: ((step1 "hello") (step2 (concat step1 " ")) (step3 (concat step2 "world"))). In said example, the resulting value of step3 is "hello world". After the variable declaration list, the next form is one or more unevaluated trees of code to be evaluated. Here is an example of a complete let statement using hypothetical data and methods:

  ;; Example let statement accepts one incoming connection on a socket and sends one response
  (let ((conn (accept-conn listen-socket))                      ;; start the var decl list, decl first var
        (hello-pfx "hello from ")                               ;; declare second var
        (hello-msg (concat hello-pfx (get-server-name)))        ;; declare third var from the second var
        (hello-response (make-http-response 200 hello-msg)))    ;; declare fourth var from the third, end list
    (log (concat "response to " (get-dst conn) ": " hello-msg)) ;; evaluates a function call using data from the first and third vars
    (send-response conn hello-response))                        ;; evaluates a function call using data from the first and fourth vars

Here you can see the usefulness of being able to declare multiple variables in quick succession. Each variable is in scope for the duration of the let statement and is dropped when the statement concludes. Thus, it costs little to break complex calculations down into reusable parts.
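
The sequential binding rule can be sketched in Rust as follows. The sketch models only string concatenation, as in the step1/step2/step3 example above, and the names Part and let_concat are hypothetical. The local scope is a map that lives for the duration of the call and is dropped on return.

```rust
use std::collections::HashMap;

/// One piece of a concatenation: a literal or a reference to an earlier binding.
enum Part {
    Lit(&'static str),
    Var(&'static str),
}

/// Bind declarations in order; each one may reference the ones before it.
/// Returns the value of the last declaration. The scope is dropped on return.
fn let_concat(decls: &[(&str, Vec<Part>)]) -> Result<String, String> {
    let mut scope: HashMap<String, String> = HashMap::new();
    let mut last = String::new();
    for (name, parts) in decls {
        let mut value = String::new();
        for part in parts {
            match part {
                Part::Lit(text) => value.push_str(text),
                // Only bindings declared earlier are visible here.
                Part::Var(var) => match scope.get(*var) {
                    Some(bound) => value.push_str(bound),
                    None => return Err(format!("undefined symbol: {}", var)),
                },
            }
        }
        scope.insert(name.to_string(), value.clone());
        last = value;
    }
    Ok(last)
}
```

Feeding it the three declarations from the earlier example yields "hello world", while a declaration that refers to a name bound later (or never) produces an error.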

Circuit

Circuit is useful for running a sequence of commands in order. A call to circuit consists of one or more forms in a sequence. All forms in the call to circuit are expected to evaluate to a boolean. The first form to evaluate to false halts the sequence, and false is returned. If all forms evaluate to true, true is returned.

Example:

  (circuit
     (load my-shell-command) ;; exit code 0 is cast to true; also requires CFG_RELISH_POSIX
     (get-state-flag global-state)
     (eq? (some-big-calculation) expected-result))
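
The short-circuit behavior can be sketched in Rust with a list of closures standing in for the unevaluated forms. The function name circuit is reused here only for readability; this is an illustrative model, not the builtin's source.

```rust
/// Run each form in order; stop at the first one that yields false.
/// Returns true only if every form returned true.
fn circuit(forms: &[&dyn Fn() -> bool]) -> bool {
    for form in forms {
        if !form() {
            // Later forms are never evaluated.
            return false;
        }
    }
    true
}
```

If the second of three forms returns false, the third is never run, which is what makes circuit suitable for chains of commands where each step depends on the previous one succeeding.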

Not quite control flow

Several other functions also evaluate their arguments lazily. The list below is non-exhaustive; these functions are mentioned here because they are useful with control flow:

  • inc: increment a symbol by one
  • dec: decrement a symbol by one
  • toggle: flip a symbol from true to false, or vice versa

For more information on these functions consult the output of the help function:

λ (help toggle)
NAME: toggle

ARGS: 1 args of any type

DOCUMENTATION:

switches a boolean symbol between true or false.
Takes a single argument (a symbol). Looks it up in the variable table.
Either sets the symbol to true if it is currently false, or vice versa.

CURRENT VALUE AND/OR BODY:
<builtin>

Quote and Eval

As stated previously, Lisp, and consequently Relish, is homoiconic. This means that code can be passed around (and modified) as data, which allows us to write self-programming programs or construct entire procedures on the fly. The primary means to do so are quote and eval. The quote function allows code to be passed around as unevaluated data that can be evaluated later. To be specific, typing (a) usually results in a symbol lookup for a, and then possibly even a function call. However, if we quote a, we can pass around the symbol itself:

  (quote a)         ;; returns the symbol a
  (quote (add 1 2)) ;; returns the following tree: (add 1 2)
  (q a)             ;; returns the symbol a

(note that quote may be shortened to q)

We can use this to build structures that evaluate into new data:

  (let ((mylist (q (add)))                  ;; store a list starting with the add function
        (myiter 0))                         ;; store an iterator starting at 0
     (while (lt? myiter 4)                  ;; loop until the iterator >= 4
       (inc myiter)                         ;; increment the iterator
       (def mylist '' (cons mylist myiter)) ;; add to the list
       (echo mylist))                       ;; print the current state of the list
     (echo (eval mylist)))                  ;; print the eval result

Notice the final body in the let form: (echo (eval mylist)). The above procedure outputs the following:

  (add 1)
  (add 1 2)
  (add 1 2 3)
  (add 1 2 3 4)
  10
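
The same idea, code built up as inert data and only executed when eval is applied, can be modeled in a few lines of Rust. Form, append, and eval are hypothetical names, and only the add function is supported; this is a sketch of the concept rather than of Relish's evaluator.

```rust
/// Code and data share one representation.
#[derive(Debug, Clone, PartialEq)]
enum Form {
    Int(i128),
    Symbol(String),
    List(Vec<Form>),
}

/// Append an item to a quoted list; the list stays plain data.
fn append(list: Form, item: Form) -> Form {
    match list {
        Form::List(mut items) => {
            items.push(item);
            Form::List(items)
        }
        other => Form::List(vec![other, item]),
    }
}

/// Evaluate a form; only now is a list treated as a function call.
fn eval(form: &Form) -> Result<i128, String> {
    match form {
        Form::Int(n) => Ok(*n),
        Form::Symbol(name) => Err(format!("unbound symbol: {}", name)),
        Form::List(items) => match items.split_first() {
            Some((Form::Symbol(name), args)) if name == "add" => {
                let mut total = 0;
                for arg in args {
                    total += eval(arg)?;
                }
                Ok(total)
            }
            _ => Err("not a callable form".to_string()),
        },
    }
}
```

Starting from a list holding only the symbol add and appending 1 through 4 produces the data (add 1 2 3 4); nothing is computed until eval is called on it.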

Lambda

Another form of homoiconicity is the anonymous function. This is a nameless function being passed around as data. It can be bound to a variable, or called directly. An anonymous function is created with the lambda function.

Here is an example of a lambda function:

  (lambda (x y) (add x y))
  ;;       |     ^ this is the function body
  ;;       +-> this is the argument list

The result of the lambda call is returned as a piece of data. It can later be called inline or bound to a variable.

Here is an example of an inline lambda call:

  ((lambda (x y) (add x y)) 1 2)

This form (call) evaluates to (returns) 3.

Here is the lambda bound to a variable inside a let statement:

  (let ((adder (lambda (x y) (add x y)))) ;; let form contains one local var
    (adder 1 2))                          ;; local var (the lambda 'adder') called here

Defining variables and functions

In Relish, both variables and functions are stored in a table of symbols. All symbols defined with def are GLOBAL. The only cases in which symbols are local are when they are defined as part of let forms or as arguments to functions. In order to define a symbol, the following arguments are required:

  • A name
  • A docstring (absolutely required)
  • A list of arguments (only needed to define a function)
  • A value

Regarding the value: A function may be defined with several trees of code to execute. In this case, the value derived from the final form in the function will be returned.

  (def my-iter 'an iterator to use in my while loop' 0) ;; a variable
  (def plus-one 'adds 1 to a number' (x) (add 1 x))     ;; a function
  (def multi-func 'example of multi form function'
       (x y)              ;; args
       (inc my-iter)      ;; an intermediate calculation
       (add x y my-iter)) ;; the final form of the function. X+Y+MYITER is returned

Make sure to read the Configuration section for information on how symbols are linked to environment variables.

Naming conventions

  • Symbol names are case sensitive
  • Symbols may contain alphanumeric characters
  • Symbols may contain one or more of the following: - _ ?
  • The idiomatic way to name symbols is all-single-case-and-hyphenated
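
The character rules above can be expressed as a short Rust predicate. This is a sketch, not Relish's lexer: is_valid_symbol is a hypothetical name, "alphanumeric" is assumed to mean ASCII letters and digits, and the check does not try to tell symbols apart from integers.

```rust
/// Check that a name uses only the characters allowed in symbols:
/// ASCII alphanumerics (an assumption), '-', '_', and '?'.
fn is_valid_symbol(name: &str) -> bool {
    !name.is_empty()
        && name
            .chars()
            .all(|c| c.is_ascii_alphanumeric() || matches!(c, '-' | '_' | '?'))
}
```

Names such as get-my-flag and set? pass the check, while a name containing a space or a quote character does not.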

Undefining variables and functions

Removing a symbol consists of a call to def with no additional arguments:

(def my-iter 'an iterator' 0)
(inc my-iter) ;; my-iter = 1
(def my-iter) ;; removes my-iter
(inc my-iter) ;; UNDEFINED SYMBOL ERROR
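
The define, use, and remove cycle can be modeled in Rust with a map from names to entries. ToyTable and Entry are hypothetical names chosen to avoid confusion with the real types in Relish's symbol module, and values are limited to integers to keep the sketch short.

```rust
use std::collections::HashMap;

/// A stored symbol: its docstring plus an integer value (hypothetical layout).
struct Entry {
    doc: String,
    value: i128,
}

/// A toy global symbol table; not Relish's real symbol table type.
struct ToyTable {
    symbols: HashMap<String, Entry>,
}

impl ToyTable {
    fn new() -> Self {
        ToyTable { symbols: HashMap::new() }
    }

    /// Define (or redefine) a global symbol; the docstring is mandatory.
    fn def(&mut self, name: &str, doc: &str, value: i128) {
        let entry = Entry { doc: doc.to_string(), value };
        self.symbols.insert(name.to_string(), entry);
    }

    /// Remove a symbol, mirroring a call to def with only a name.
    fn undef(&mut self, name: &str) {
        self.symbols.remove(name);
    }

    /// Increment a symbol in place, failing if it is not defined.
    fn inc(&mut self, name: &str) -> Result<i128, String> {
        match self.symbols.get_mut(name) {
            Some(entry) => {
                entry.value += 1;
                Ok(entry.value)
            }
            None => Err(format!("undefined symbol: {}", name)),
        }
    }

    /// Look up a symbol's docstring, as a help function might.
    fn doc(&self, name: &str) -> Option<&str> {
        self.symbols.get(name).map(|e| e.doc.as_str())
    }
}
```

Following the example above: after def, inc returns 1; after undef, the same inc call returns an undefined symbol error.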

Builtin functions

As opposed to listing every builtin function here, it is suggested to the user to do one of two things:

Documentation

Tests

Most of the tests evaluate small scripts (single forms) and check their output. Perusing them may yield answers on all the cases a given builtin can handle. See the tests directory.

Help function

Relish is self documenting. The help function can be used to inspect any variable or function. It will show the name, current value, docstring, arguments, and definition of any builtin or user defined function or variable.

> (help my-adder)
NAME: my-adder

ARGS: 2 args of any type

DOCUMENTATION:

adds two numbers

CURRENT VALUE AND/OR BODY:
args: x y 
form: ((add x y))
> (help CFG_RELISH_ENV)
NAME: CFG_RELISH_ENV

ARGS: (its a variable)

DOCUMENTATION:

my env settings

CURRENT VALUE AND/OR BODY:
true

Every single symbol in Relish can be inspected in this way, unless some third party developer purposefully left a docstring blank.

Snippets directory

The snippets directory may also yield some interesting examples. Within it are several examples that the authors and maintainers wanted to keep around but didn't know where to put. It is sort of like a lint roller. It also contains considerably subpar implementations of Relish's internals that are kept around for historical reasons.

Userlib

The Userlib was added as a script containing many valuable functions such as set and prepend. You can use it by calling it in your shell config (see the minimal shell configuration example for more info).

Easy patterns

This section contains examples of common composites of control flow that can be used to build more complex or effective applications. More ideas may be explored in the snippets directory of this project. The author encourages users to contribute their own personal favorites not already in this section, either by adding them to the snippets folder or by extending the documentation here.

while-let combo

  ;; myiter = (1 (2 3 4 5 6))
  (def myiter 'iterator over a list' (head (1 2 3 4 5 6)))

  ;; iterate over each element in myiter
  (while (gt? (len (cdr myiter)) 0)      ;; while there are more elements to consume
    (let ((elem (car myiter))            ;; elem = consumed element from myiter
          (remaining (cdr myiter)))      ;; remaining = rest of elements
      (echo elem)                        ;; do a thing with the element, could be any operation
      (def myiter '' (head remaining)))) ;; consume next element, loop

The while-let pattern can be used for many purposes. Above it is used to iterate over elements in a list. It can also be used to receive connections to a socket and write data to them.

let destructuring

let is very useful for destructuring complex return types. If you have a function that may return a whole list of values you can then call it from let to consume the result data. In this example a let form is used to destructure a call to head. head returns a list consisting of (first-element rest-of-list) (for more information see (help head)). The let form starts with the output of head stored in head-struct (short for head-structured). The next variables defined are first and rest which contain individual elements from the return of the call to head. Finally, the bodies evaluated in the let form are able to operate on the head and the rest.

  ;; individually access the top of a list
  (let ((head-struct (head (1 2 3)))
        (first (car head-struct))
        (rest (cdr head-struct)))
     (echo "this is 1: " first)
     (echo "this is 2, 3: " rest))

if-set?

One common pattern seen in bash scripts and makefiles is the set-variable-if-not-set pattern.

  MYVAR ?= MY_SPECIAL_VALUE

Translated into Relish, the same pattern looks like this:

    (if (set? myvar)
        () ;; no need to do anything... or add a call here
        (def myvar "my variable explanation..." "MY_SPECIAL_VALUE"))

Alternatively this combination can be used to process flags in a script or application:

  (if (set? myflag)
      (process-flag myflag)
      ())

Configuration

By default Relish will read from ~/.relishrc for configuration, but the default shell will also accept a filename from the RELISH_CFG_FILE environment variable. See the minimal shell configuration example for an example of a basic configuration file. Other snippets, including mood-prompt and avas-laptop-prompt, demonstrate more complex configurations.
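
One plausible way to resolve the configuration file path is sketched below in Rust. The precedence (RELISH_CFG_FILE first, then ~/.relishrc), the use of the HOME variable to expand ~, and the function names config_path and config_path_from_env are all assumptions made for illustration; consult the run module for the actual behavior.

```rust
use std::env;
use std::path::PathBuf;

/// Pick the config file: an explicit override wins, otherwise fall back
/// to .relishrc in the home directory (assumed precedence).
fn config_path(override_file: Option<String>, home: Option<String>) -> Option<PathBuf> {
    match override_file {
        Some(file) if !file.is_empty() => Some(PathBuf::from(file)),
        _ => home.map(|h| PathBuf::from(h).join(".relishrc")),
    }
}

/// Read both inputs from the process environment.
fn config_path_from_env() -> Option<PathBuf> {
    config_path(env::var("RELISH_CFG_FILE").ok(), env::var("HOME").ok())
}
```

Keeping the decision in a function that takes its inputs as arguments makes the fallback logic easy to test without touching the real environment.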

The configuration file

The configuration file is a script containing arbitrary Relish code. On start, any shell which leverages the configuration code in the config module (run.rs) will create a clean, separate context, including default configuration values, within which the standard library will be initialized. The configuration file is evaluated and run as a standalone script and may include arbitrary executable code. Afterwards, configuration values found in the variable map will be used to configure the standard library function mappings that the shell will use. Errors during configuration are non-terminal. In such a case any defaults which have not been overwritten will remain present.

Important points to note:
  • When the configuration file is run, it will be run with default configuration values.
  • The user/script interpreter will be run with the standard library configured to use the previously defined configuration variables.
  • The standard library will then be re-processed and re-added to the symbol table with new configuration.
  • Variables and functions defined during configuration will carry over to the user/script interpreter, allowing the user to load any number of custom functions and variables.
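
The sequence described above (start from defaults, run the config script, then configure the library from whatever values are present) can be modeled in Rust as follows. This is a simplified sketch: the table only holds the two boolean settings, the config script is a closure, and Table, defaults, configure, and dynamic_features are hypothetical names rather than functions from run.rs.

```rust
use std::collections::HashMap;

type Table = HashMap<String, bool>;

/// Default configuration values, as listed in this Readme.
fn defaults() -> Table {
    let mut table = Table::new();
    table.insert("CFG_RELISH_POSIX".to_string(), false);
    table.insert("CFG_RELISH_ENV".to_string(), true);
    table
}

/// Run a config script against the defaults.
/// Errors are reported but are not fatal; values set before the error remain.
fn configure<F>(script: F) -> Table
where
    F: FnOnce(&mut Table) -> Result<(), String>,
{
    let mut table = defaults();
    if let Err(msg) = script(&mut table) {
        eprintln!("config error (continuing): {}", msg);
    }
    table
}

/// Decide which configurable library pieces to load from the final values.
fn dynamic_features(table: &Table) -> Vec<&'static str> {
    let mut out = Vec::new();
    if table.get("CFG_RELISH_POSIX").copied().unwrap_or(false) {
        out.push("posix-job-control");
    }
    if table.get("CFG_RELISH_ENV").copied().unwrap_or(true) {
        out.push("env-sync");
    }
    out
}
```

A script that enables CFG_RELISH_POSIX and then fails still leaves that setting in place, while untouched settings keep their defaults.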

Configuration Values

  • CFG_RELISH_POSIX (default false): when true, enables POSIX style job control.
  • CFG_RELISH_ENV (default true): when true, interpreter's variable table and environment variable table are kept in sync.
  • CFG_RELISH_L_PROMPT (default 'λ'): a function that is called with no arguments to output the left hand of the prompt
  • CFG_RELISH_R_PROMPT (default ''): a function that is called with no arguments to output the right hand of the prompt
  • CFG_RELISH_PROMPT_DELIMITER (default '>'): a function that is called with no arguments to output the delimiter separating prompt from user input

Prompt design

For an example of prompt design see the mood prompt. For a more complex example see Ava's laptop prompt.

Further configuration

Further configuration can be done by loading scripts that contain more functions and data to evaluate. Variables and functions defined in an external script loaded by your interpreter will persist in the symbol table.

  (call "my-extra-library-functions.rls")

Using Relish as your shell

As of version 0.3.0 Relish implements all the features of an interactive shell. See further documentation in the shell documentation.

Compilation

  cargo build

Testing

  cargo test

Running (the main shell)

  cargo run src/bin/main.rs

The codebase

The tests directory

Start here if you are new.

Eval tests

These are particularly easy to read and write tests. They primarily cover execution paths in the evaluation process.

Func tests

These tests extend the eval tests to cover the co-recursive nature between eval and func calls.

Lex tests

These tests verify the handling of syntax.

Lib tests: (tests/test_lib*)

These tests are unique per stdlib module and work to prove the functionality of builtin functions in the language.

Source directory

This directory contains all of the user facing code in Relish.

Just a few entries of note:

Segment module

This file lays out the data structures that the interpreter operates on. Representation of code trees, traversals, and type annotations all live here.

lib.rs

This defines a library that can be included to provide an interpreter interface within any Rust project. The components defined here can certainly be used to support language development for other LISP (or non LISP) languages. Your project can use or not use any number of these components.

Symbol module

This file contains all code related to symbol expansion and function calling. The types defined in this file include SymTable, Args, Symbol, and more. Code to call Lambda functions also exists in here.

Run module

This file contains functions which load and run the configuration file script. For more information see the configuration section above in this Readme.

Standard library module

This defines the static_stdlib function and the dynamic_stdlib function. The static_stdlib function loads all symbols in the standard library which do not need further configuration into the symbol table. The dynamic_stdlib function loads all symbols in the standard library which do need configuration into the symbol table. The dynamic_stdlib function uses variables saved in the symbol table to configure the functions and variables it loads. This file also contains default configuration values. Any new addition to the stdlib must make its way here to be included in the main shell (and any other shell using the included stdlib functions). You may choose to override these functions if you would like to include your own special functions in your own special interpreter, or if you would like to pare down the stdlib to a lighter subset of what it is.

You can view the code for standard library functions in the standard library directory.

binary directory

This contains any executable target of this project. Notably the main shell.

Current Status / TODO list

Note: this section will not show the status of each item unless you are viewing it with a proper orgmode viewer.

Note: this section only tracks the state of incomplete TODO items. Having everything on here would be cluttered.

TODO alpha tasks

  • exit function (causes program to shut down and return code)
  • probably push the symtable inserts into functions in individual stl modules (like posix.rs does)
  • tag and release v0.3.0

TODO v1.0 tasks

  • Be able to use features to compile without env or posix stuff

    • add a new binary target that is a simple posix repl demo
    • I think you'll need to refactor userlib to hide set behind a conditional based on whether CFG_RELISH_POSIX is set or not
    • add a compilation task to CI
  • finish basic goals in the interactive development library
  • Investigate has_next member function for &Seg and maybe simplify stdlib and probably also eval/sym
  • Rename to Flesh
  • Can pass args to relish scripts (via interpreter)
  • Can pass args to relish scripts (via command line)
  • Scripts can use shell
  • History length configurable (env var?)
  • Lex function
  • Read function (Input + Lex)
  • Make an icon if you feel like it

TODO v1.1 tasks (Stable 1)

  • finish stretch goals in the interactive development library
  • execute configurable function on cd
  • Post to relevant channels
  • Implement Compose for lambdas (add to readme)
  • File operations

    • read-to-string
    • write-to-file
    • file exists?
    • (add this all to the readme)
  • color control library

    • probably more escapes in the lexer
    • just a snippet with a bunch of color constants
  • Search delim configurable

TODO v1.2 release tasks

  • Emacs syntax highlighting and/or LSP implementation
  • Bindings for the simplest possible UI library?
  • Get on that bug tracker
  • Network library

    • HTTP Client
    • TCP Stream client
    • UDP Client
    • TCP Listener
    • HTTP Listener
    • UDP Listener

Special thanks

Special thanks to 'Underscore Nul' for consulting with me in the early stages of this project. Meeting my goal of only using safe Rust (with the exception of the posix module) would have been a much bigger challenge if not for having someone to experiment on design ideas with.