Lynx is a computable specification language designed to make building great software more intuitive and efficient. It is built on the view that programming is fundamentally an act of human communication. After a brief discussion of the philosophy behind Lynx and what a programming ecosystem should look like, this document details the formal specification of the Lynx language.
When most people think about programming they think about writing code. Instead, it should be thought of more abstractly as the act of communicating complex ideas precisely. With traditional programming languages, programmers are really performing two separate acts at the same time: writing code to communicate with computers and writing documentation to communicate with other programmers. These two parts are held together by an informal set of best practices, but they should be unified within a single programming environment.
Traditional text-based coding is the standard that almost every general-purpose programming language uses because of its expressive power and flexibility. It provides a consistent interface for every program, and the visible text is the entirety of the implementation. The few graphical programming languages that exist fall short because they try to replicate the semantics of text-based code in graphical form. Lynx takes a different approach.
Why is a graphical representation good? It provides edit locality: the programmer doesn't need to search through source code to find where something is implemented. It is also implementation agnostic.
Functional programming focuses on a single type of relation between objects: how an object is evaluated. While this is an interesting property of an object, and probably the most useful from a computational perspective, it makes programming in a functional style difficult. The solution is to add a generalization of relations between objects, of which evaluation is only one.
Lynx builds features for determining how the result of a program was defined directly into the language. Say a programmer is implementing a collision detection algorithm but it isn't working properly: they can introspect into how the collision boolean was defined instead of stepping forward through imperative code to the point of the error. When combined with graphical representations of each of these concepts, this forms an extremely powerful debugging tool.
Keep track of definitions. In most languages, computation is uni-directional. Lynx keeps track of where values come from so that they can be audited at a later time. This feature creates an extremely powerful tool for debugging programs. Instead of needing to understand the whole structure of a program and work forward to a bug, the programmer can see where there is an incorrect value and then work backwards. Keeping track of definitions also allows for more intuitive update patterns where the programmer doesn't have to explicitly trigger an update. This improves the separation between specification and implementation, prevents bugs, and improves possibilities for optimization.
Sum types help here as well: when a case is added to a sum type, coverage checking identifies every place where the program must be updated.
The correctness of Lynx programs is defined by a single guiding principle: all information justifying the correctness of a program should be part of the program. This includes documentation, information left implicit in the code, and detailed information about the compiler.
When designing a line intersection algorithm, the programmer should be able to see the lines rather than have them hidden. When zipping two lists, there should be some justification that they are the same length, and this justification should be part of the program. When using floating-point numbers, the programmer should specify that the final result will be within certain error bounds.

With Lynx we introduce a new programming paradigm called the object-context paradigm. The object-oriented style matches the way people think about concepts in the real world more closely than functional or imperative programming styles, which is one reason for the relative popularity of object-oriented programming. Lynx uses a carefully constrained form of object-oriented programming so that programs are more intuitive to understand while still allowing the stricter program analysis of a functional paradigm. Lynx can also be thought of as a purely declarative form of object-oriented programming.
Objects in Lynx consist of a mapping between attributes and values. Both attributes and values are also objects. One way to represent objects graphically is with a tree diagram where the objects are nodes and the attributes are edges.
Objects can have multiple attributes, represented as different branches of the tree, and because values are also objects the tree can be multiple levels deep. This forms a recursive data type that can represent any other data type in Lynx.
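As a rough sketch, this recursive structure can be modeled with an ordinary algebraic data type (Haskell is used as a stand-in throughout these sketches; Lynx syntax is not assumed, and the constructor names are hypothetical):

```haskell
-- A minimal model of Lynx objects: an object maps attribute objects to
-- value objects, and atoms terminate the recursion.
data LynxObject
  = Atom String                        -- a leaf value such as a name or a number
  | Object [(LynxObject, LynxObject)]  -- (attribute, value) pairs
```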
We will use text being drawn on a screen as an example. The text can have several attributes like its position on the screen, the string to be printed, font type, font size, etc. We can also attach metadata, like the name of the text box, as an attribute.
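In terms of the sketch above, this example might be encoded as follows (all attribute names and values are illustrative):

```haskell
-- The text box as a two-level object tree; the position attribute is
-- itself an object with x and y attributes.
textBox :: LynxObject
textBox = Object
  [ (Atom "name",     Atom "text")
  , (Atom "string",   Atom "Hello, world")
  , (Atom "fontSize", Atom "12")
  , (Atom "position", Object [ (Atom "x", Atom "20")
                             , (Atom "y", Atom "30") ])
  ]
```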
Every tree or subtree in this diagram is an object. So far, this just describes a static graphical layout. Lynx is able to do computation once we introduce context and references.
We can describe a text object (text box) whose position is defined by the graphical element it is put inside. We will call the "parentElement" of the text box "group" and then we will put the group element inside of a window.
This can be represented by the following Lynx object tree:
In this case the text object (text and everything below it) is placed in the context of the group. The text object has two references that refer to values outside of the text object. The reference "group.position.x" refers to the x component of the position of the group object. The text object will always behave the same way if it is placed in an identical "group" context. We can think of an object as a pure function, where the references to its surrounding context correspond to the arguments of that pure function. So now if we change the position of the "group" element, the text will follow and always be offset by \((20, 20)\).
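The pure-function view can be made concrete with a small sketch (illustrative Haskell, not Lynx syntax; the \((20, 20)\) offset comes from the example above):

```haskell
-- The text object's absolute position as a pure function of its context:
-- the references group.position.x and group.position.y become arguments.
textPosition :: (Double, Double) -> (Double, Double)
textPosition (groupX, groupY) = (groupX + 20, groupY + 20)
```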
We've now made these definitions dynamic, but with this system "text" is not really isolated from "group" because the references still refer to "group" by name. It would be nicer if we could say that the position of "text" is relative to whatever its parent element is. To solve this problem we define a concept called an inverse attribute; in this case "parentElement" is the inverse attribute of "childElement".
With inverse attributes, the definition of "text" depends on the context it is in. Its parentElement can have any name but it must have a position with x and y attributes. We can again think of text as a pure function with these references as arguments.
This forms the core of the Lynx language. Next, we will look at how these object graphs can change over time in response to new data and how they store state.
So far, we have discussed how static lynx objects are structured and evaluated. They can also change in response to outside events and store data. We'll start with a simple, non-stateful example where the position of the text box is set to be the position of the user's mouse.
For each of these sections of the type hierarchy, we will create a specification for each object in two steps:
Types are used to constrain the behavior of programs and, when they are well-formed, ensure correctness within the boundaries of the type system. In some dynamically typed languages like JavaScript or Python the typing is considered "weak" because there are very few rules about which objects can be used in which places. This is often seen as a feature because it is less confusing for beginners writing simple programs.
Lynx takes the opposite approach. Type structures can always be relaxed selectively, but when there is no type structure the ability to reason about programs is lost. One way of thinking about it is that a weakly typed language can be created within a strongly typed language by purposefully relaxing all of the rules, but a strongly typed language cannot be replicated in a weakly typed one, so the strongly typed language must be more expressive.
In most languages, abstract types like sum types, product types and function types are first-class concepts, but in Lynx they are subtypes of Lynx expressions. The meaning of the abstract types is given by their behavior upon evaluation. We will start with the simplest abstract types and work towards the most general ones. A brief note on notation: \(a:A\) means \(a\) is a subtype of \(A\).
What does it mean for an object to be an atom in type \(\tau\)? Do we want to say \(a:Atom \quad a:\tau\), or do we want to say that an object can be an atom in one type but not in another? I.e., with \(Nat = Zero \:|\: One \:|\: Two \:|\: Three \ldots\), this would correctly make the statement \(Three:successor(Nat)\). Also, what does it mean for an element to be of type atom? Can we write \(3:\bullet\)? This would imply that the atom type itself has an infinite number of elements.
We can say that any two atoms of the same type that are definitionally equal are equal in that type. This pattern is essential for reducing the global complexity of code, and its power will become more apparent when we introduce more complex types. If two developers are working on the same concept without knowledge of each other, this offers a mechanism to merge the two concepts into one and ensure the correctness of all dependent code. We will start with the simple case of atoms and add more complex cases as we introduce more complex types.
While any concept in Lynx can be represented as an Atom, programs constructed only from atoms would lack any structure. Let's first look at an example of a sum type of atoms. If we have two atoms called \(True\) and \(False\) then we can define a type called \(Boolean\) defined as \(Boolean = True \:|\: False\). If instances of the type \(Atom\) are defined as having exactly one possible value then the binary sum of two atoms has \(1+1 = 2\) inhabitants. This is where the name sum type comes from.
If the natural numbers are defined as \(Nat = Zero \:|\: succ(Nat)\) then we can define addition of natural numbers as a function \[\begin{aligned}Add &: (Nat, Nat) \rightarrow Nat\\ Add(Zero, n) &= n \\ Add(succ(x), y) &= Add(x, succ(y)) \end{aligned}\] Coverage checking ensures that when a definition pattern-matches on the elements of a sum type, all elements of the sum type are covered.
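A direct transcription of this definition (Haskell as a stand-in) shows what coverage checking protects:

```haskell
data Nat = Zero | Succ Nat

add :: Nat -> Nat -> Nat
add Zero     n = n
add (Succ x) y = add x (Succ y)
-- Removing either equation leaves the pattern match non-exhaustive,
-- which is exactly what coverage checking reports.
```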
Are sum types defined as pattern matching on attributes? (We can say \(zero : nat\) and \(successor(nat) : nat\), i.e. \(nat = sumtype(zero, succ(nat))\); declarations like \(zero:nat\), \(one:nat\), \(one:succ(zero)\) don't need to be minimal but they do need to be consistent. What is the implementation of this, and of the function search?) Sum types are expressed in terms of the isA relation. We will use the natural numbers as an example: we can say that \(zero\) isA naturalNumber, and that for any natural number \(n\), \(n.successor\) is a natural number. As alluded to in the design section, Lynx has a mechanism for programmers to generalize concepts over time and simplify the program.
\(\tau_1 \:|\: \tau_2 \:|\: \tau_3 \neq \tau_1 \:|\: (\tau_2 \:|\: \tau_3)\). Exhaustiveness checking is different for these two cases, but is exhaustiveness checking then the same for the parenthesized type? Is there actually an isomorphism there? When we are programming it is useful to start by defining the simple case of a type and then creating a more general version later. This process of generalization often gets programmers into trouble: they have used a specific version of a type somewhere in their code and the new generalization breaks it. Let's start with a very basic example. We can create a new object called "Zero" and say it is an element of the natural numbers by adding a subtype (isA) relation between it and the "natural number" object.
Now we can say that "One" is also a natural number by adding the definition \(\text{One}: \text{Nat}\), and likewise \(\text{Two}:\text{Nat}\).
We will use construction of a coordinate point as a motivating example throughout this chapter. A coordinate point consists of an x-coordinate and a y-coordinate and can be written as a pair \((x, y)\). Notice that this contains two pieces of information: an x-coordinate that is a number and a y-coordinate that is a number. We will use the notation \(x: \text{Number}\) (read "\(x\) is of type Number") as a shorthand. With this notation we can also say \(y: \text{Number}\) and \((x, y):\text{CoordinatePoint}\).
Formal type theory has different kinds of rules that describe how types are formed and used. We will introduce three kinds of rules: introduction rules, elimination rules and computation rules.
Given we have something called \(x\) that is a number and something called \(y\) that is a number, we can create a coordinate point \((x, y)\). The notation we use for this in type theory is: \[\frac{x:\text{Number} \quad y:\text{Number}}{(x, y):\text{CoordinatePoint}}\] The text above the line lists the assumptions required for the statement below the line to be true. So to create a coordinate point we need a number \(x\) and a number \(y\); assuming we have both of these, we can create a coordinate point.
Elimination rules are sort of the inverse of introduction rules. They ask: given that we already have a coordinate point, what information can we pull out of it? In this case there are two elimination rules, which we'll call \(E_1\) and \(E_2\):\[\begin{align} \frac{(x, y):\text{CoordinatePoint}}{x:\text{Number}}(E_1) && \frac{(x, y):\text{CoordinatePoint}}{ y:\text{Number}}\left(E_2\right)\end{align}\] \(E_1\) says that if we have a coordinate point we can extract an x-coordinate of type Number, and \(E_2\) says that we can also extract a y-coordinate of type Number. Notice that these elimination rules are a sort of inverse to the introduction rule: the introduction rule shows how to put pieces together and the elimination rules show how to take them back apart. This concept will be made more formal with the computation rule.

If all of this seems too basic, consider this: even though these rules are simple and may seem trivial, they form a structure of types that can be checked by a computer. As types get more complicated, the computer can check that the programmer has handled every case and that the types fit together. The mechanization this allows is the important part.
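As a sketch, the introduction rule corresponds to a constructor and the elimination rules to projections (MkPoint, e1 and e2 are names invented here for illustration):

```haskell
type Number = Double

data CoordinatePoint = MkPoint Number Number  -- introduction rule

e1 :: CoordinatePoint -> Number               -- elimination rule E1
e1 (MkPoint x _) = x

e2 :: CoordinatePoint -> Number               -- elimination rule E2
e2 (MkPoint _ y) = y
-- The computation rules then state that e1 (MkPoint x y) reduces to x
-- and e2 (MkPoint x y) reduces to y.
```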
Type systems are used to improve the correctness of code by preventing programmers from using pre-built abstractions in ways they were not designed for. The big question we should be asking when determining whether two types are equal is "Can these two types be used interchangeably without breaking anything?" Notice that this question is about the behavior of the types, not their structure. Most formal type systems, including algebraic data types (ADTs), focus on comparing structures, but two types can have the same structure and not be interchangeable without more information, or be interchangeable while having different structures. Let's go back to the example of a coordinate point to look at interchangeability.
2D coordinate points can be represented in Cartesian or polar coordinates.
Both of these coordinate points are pairs of numbers, so they have the same algebraic data type, but with ADTs there isn't any information about how to actually convert between the two types. Let's define two conversion functions:
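For example, a sketch of the two representations and the coercions between them (all names here are invented for illustration):

```haskell
type Number = Double

data Cartesian = Cartesian Number Number   -- (x, y)
data Polar     = Polar     Number Number   -- (r, theta), theta in radians

toPolar :: Cartesian -> Polar
toPolar (Cartesian x y) = Polar (sqrt (x * x + y * y)) (atan2 y x)

toCartesian :: Polar -> Cartesian
toCartesian (Polar r t) = Cartesian (r * cos t) (r * sin t)
```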
This is where the concept of interfaces in languages like C++ leaves the correctness of these functions to the programmer. Interfaces make no guarantees that the types are actually interchangeable.
To ensure that the two types are equal we need to make sure that every element of one type can not only be converted to the equivalent element of the other type but can also be converted back to the same element. This forces the conversion functions to work for every edge case. For example, converting the polar coordinate point \((1, 450^\circ)\) to Cartesian coordinates and back yields \((1, 90^\circ)\), since angles are only meaningful modulo \(360^\circ\). To show this we need the identity functions for both polar and Cartesian coordinate points:
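Reusing the types from the sketch above, the identity functions are simply:

```haskell
idCartesian :: Cartesian -> Cartesian
idCartesian p = p

idPolar :: Polar -> Polar
idPolar q = q
```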
Next we need to show that each identity function is equal to the corresponding composition of the coercion functions.
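In terms of the earlier sketch, the claim is a pair of function equalities, read under extensionality and only up to floating-point error and angle normalization:

```haskell
roundTripCartesian :: Cartesian -> Cartesian
roundTripCartesian = toCartesian . toPolar   -- should equal idCartesian

roundTripPolar :: Polar -> Polar
roundTripPolar = toPolar . toCartesian       -- equals idPolar only up to
                                             -- normalizing angles: the
                                             -- (1, 450 degrees) edge case
```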
Lynx has a richer notion of equality of elements than other languages. It is built on top of Lynx expressions in the form of equality relations.
Atomic Equality \[\frac{v:\tau}{v=_\tau v}\]

Expression Equality \[\frac{e_1\Downarrow_\tau v_1 \quad e_2\Downarrow_\tau v_2 \quad v_1\equiv_\tau v_2}{e_1 =_\tau e_2}\]

Product Type Equality \[\frac{e_1 =_\tau e_2 \quad e_3 =_\sigma e_4}{e_1 \times e_3 =_{\tau \times \sigma} e_2 \times e_4}\]

Sum Type Equality \[\frac{}{}\] As an example, we will use two representations of the natural numbers: we want to be able to prove that the binary representation of the natural numbers is equal to the unary representation.

Function Equality. One approach to determining equality of functions is called function extensionality. Under this definition, two functions \(f: A\rightarrow B\) and \(g: A \rightarrow B\) are equivalent iff for every \(a:A\), \(f(a) =_B g(a)\).

Reflexivity \[\frac{e:\tau}{e =_\tau e}\]

One of the goals of Lynx is to separate the specification of a program from its implementation. The specification and implementation of the language itself are no different. Lynx is (eventually) written in itself, so the underlying implementation of the language can be swapped out without affecting its behavior.
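A sketch of the two representations of the naturals mentioned above (the little-endian binary encoding is an assumption of this example), with a coercion in one direction:

```haskell
data Unary  = Z | S Unary                -- Nat = Zero | succ(Nat)
data Binary = End | O Binary | I Binary  -- bits, least-significant first

-- Interpret a binary numeral as a unary one.
toUnary :: Binary -> Unary
toUnary End   = Z
toUnary (O b) = double (toUnary b)
toUnary (I b) = S (double (toUnary b))

double :: Unary -> Unary
double Z     = Z
double (S n) = S (S (double n))
-- A full equality proof also needs a coercion the other way and the
-- round-trip laws; that is subtle here because this binary encoding is
-- not canonical (trailing O bits), which sum type equality must address.
```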
The most basic way the implementation and the specification are separated is with the concept of primitives. A specification for a basic building block of computation serves as the denotational semantics for an underlying language feature that does the actual computation. For example, we could define a type in lynx for 32 bit unsigned integers as a list of binary digits with length 32. We can then define a WebAssembly uint32 as a primitive for this type. The primitive states without proof (but with tests) that the specification for a uint32 has a one-to-one correspondence with a WebAssembly uint32. The compiler can then decide to use the WebAssembly uint32 as an optimization. In practice, this is how all computation in Lynx is done.
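The specification side of that example might look like the following sketch (the names and the most-significant-bit-first encoding are assumptions; the correspondence with the WebAssembly uint32 would be asserted by the primitive, not proven):

```haskell
-- Specification: a uint32 is a list of exactly 32 binary digits.
-- The length invariant is assumed here rather than enforced by the type.
newtype UInt32 = UInt32 [Bool]

-- What a bit list denotes as a number, most-significant bit first.
denote :: UInt32 -> Integer
denote (UInt32 bits) = foldl (\acc b -> acc * 2 + (if b then 1 else 0)) 0 bits
```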
Each object in Lynx is identified by a hash. Informally, the hash of an object is calculated by recursively hashing all of the child nodes of the current object. Let \(H: \text{LynxObject} \rightarrow \{0,1\}^n\) be a cryptographic hash function. The hash of a node is the result of hashing its children together, forming an unbalanced Merkle tree. Starting at the bottom of the tree, we apply the hash function \(H\) to the values 20, 30 and "text" respectively. Leaf nodes are prepended with a 0x00 byte and internal nodes with a 0x01 byte to prevent a second-preimage attack.
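A sketch of this scheme, reusing the LynxObject model from earlier (SHA-256 stands in for \(H\), and the byte-level encoding of atoms is an assumption of this sketch):

```haskell
import qualified Data.ByteString as BS
import qualified Data.ByteString.Char8 as C8
import Crypto.Hash.SHA256 (hash)  -- package cryptohash-sha256, assumed available

data LynxObject = Atom String | Object [(LynxObject, LynxObject)]

-- Leaves are prefixed with 0x00 and internal nodes with 0x01, and child
-- hashes are concatenated, forming an unbalanced Merkle tree.
hashObject :: LynxObject -> BS.ByteString
hashObject (Atom s) =
  hash (BS.singleton 0x00 <> C8.pack s)
hashObject (Object attrs) =
  hash (BS.singleton 0x01 <>
        BS.concat [ hashObject a <> hashObject v | (a, v) <- attrs ])
```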