Lynx is a computable specification language designed to make building great software more intuitive and efficient. It is built on the view that programming is fundamentally an act of human communication. After a brief discussion of the philosophy behind Lynx and what a programming ecosystem should look like, this document details the formal specification of the Lynx language.
When most people think about programming they think about writing code. Instead, it should be thought of more abstractly as the act of communicating complex ideas precisely. With traditional programming languages, programmers are really performing two separate acts at the same time: writing code to communicate with computers and writing documentation to communicate with other programmers. These two parts are held together by an informal set of best practices, but they should be unified within a single programming environment.
Traditional text-based coding is the standard that almost every general-purpose programming language uses because of its expressive power and flexibility. It provides a consistent interface for every program, and the visible text is the entirety of the implementation. The few graphical programming languages that exist fall short because they try to replicate the semantics of text-based code in graphical form. Lynx takes a different approach.
Why is a graphical representation good? It provides edit locality: the programmer doesn't need to search through source code to find where something is implemented. It is also implementation agnostic.
Functional programming focuses on a single type of relation between objects: how an object is evaluated. While this is an interesting property of an object, and probably the most useful from a computational perspective, it makes programming in a functional style difficult. The solution is to add a generalization of relations between objects, of which evaluation is only one.
Lynx builds features for determining how the result of a program was defined directly into the language. Say a programmer is implementing a collision detection algorithm but it isn't working properly: they can introspect into how the collision boolean was defined instead of stepping forward through imperative code to the point of the error. When combined with graphical representations of each of these concepts, this forms an extremely powerful debugging tool.
Keep track of definitions. In most languages, computation is uni-directional. Lynx keeps track of where values come from so that they can be audited at a later time. This feature creates an extremely powerful tool for debugging programs. Instead of needing to understand the whole structure of a program and work forward to a bug, the programmer can see where there is an incorrect value and then work backwards. Keeping track of definitions also allows for more intuitive update patterns where the programmer doesn't have to explicitly trigger an update. This improves the separation between specification and implementation, prevents bugs, and improves possibilities for optimization.
Sum types help here as well: when a case is added to a sum type, coverage checking identifies every place where the program must be updated.
The correctness of Lynx programs is defined by a single guiding principle: all information justifying the correctness of a program should be part of the program. This includes documentation, information left implicit in the code, and detailed information about the compiler.
When designing a line intersection algorithm, the programmer should be able to see the lines rather than have them hidden. When zipping two lists, there should be some justification that they are the same length, and this justification should be part of the program. When using floating-point numbers, the programmer should specify that the final result will be within certain error bounds.

With Lynx we introduce a new programming paradigm called the object-context paradigm. The object-oriented style matches the way people think about concepts in the real world more closely than functional or imperative programming styles, which is one reason for the relative popularity of object-oriented programming. Lynx uses a carefully constrained form of object-oriented programming so that programs are more intuitive to understand while still allowing the stricter program analysis of a functional paradigm. Lynx can also be thought of as a purely declarative form of object-oriented programming.
Objects in Lynx consist of a mapping between attributes and values. Both attributes and values are also objects. One way to represent objects graphically is with a tree diagram where the objects are nodes and the attributes are edges.
Objects can have multiple attributes, represented as different branches of the tree, and because values are also objects the tree can be multiple levels deep. This forms a recursive data type that can represent any other data type in Lynx.
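As a rough sketch, this recursive structure can be modeled with an ordinary algebraic data type (Haskell is used as a stand-in throughout these sketches; Lynx syntax is not assumed, and the constructor names are hypothetical):

```haskell
-- A minimal model of Lynx objects: an object maps attribute objects to
-- value objects, and atoms terminate the recursion.
data LynxObject
  = Atom String                        -- a leaf value such as a name or a number
  | Object [(LynxObject, LynxObject)]  -- (attribute, value) pairs
```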
We will use text being drawn on a screen as an example. The text can have several attributes like its position on the screen, the string to be printed, font type, font size, etc. We can also attach metadata, like the name of the text box, as an attribute.
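In terms of the sketch above, this example might be encoded as follows (all attribute names and values are illustrative):

```haskell
-- The text box as a two-level object tree; the position attribute is
-- itself an object with x and y attributes.
textBox :: LynxObject
textBox = Object
  [ (Atom "name",     Atom "text")
  , (Atom "string",   Atom "Hello, world")
  , (Atom "fontSize", Atom "12")
  , (Atom "position", Object [ (Atom "x", Atom "20")
                             , (Atom "y", Atom "30") ])
  ]
```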
Every tree or subtree in this diagram is an object. So far, this just describes a static graphical layout. Lynx is able to do computation once we introduce context and references.
We can describe a text object (text box) whose position is defined by the graphical element it is put inside. We will call the "parentElement" of the text box "group" and then we will put the group element inside of a window.
This can be represented by the following Lynx object tree:
In this case the text object (text and everything below it) is placed in the context of the group. The text object has two references that refer to values outside of the text object. The reference "group.position.x" refers to the x component of the position of the group object. The text object will always behave the same way if it is placed in an identical "group" context. We can think of an object as a pure function, where the references to its surrounding context correspond to the arguments of that pure function. So now if we change the position of the "group" element, the text will follow and always be offset by \((20, 20)\).
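The pure-function view can be made concrete with a small sketch (illustrative Haskell, not Lynx syntax; the \((20, 20)\) offset comes from the example above):

```haskell
-- The text object's absolute position as a pure function of its context:
-- the references group.position.x and group.position.y become arguments.
textPosition :: (Double, Double) -> (Double, Double)
textPosition (groupX, groupY) = (groupX + 20, groupY + 20)
```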
We've now made these definitions dynamic, but with this system "text" is not really isolated from "group" because the references still refer to "group" by name. It would be nicer if we could say that the position of "text" is relative to whatever its parent element is. To solve this problem we define a concept called an inverse attribute; in this case "parentElement" is the inverse attribute of "childElement".
With inverse attributes, the definition of "text" depends on the context it is in. Its parentElement can have any name but it must have a position with x and y attributes. We can again think of text as a pure function with these references as arguments.
This forms the core of the Lynx language. Next, we will look at how these object graphs can change over time in response to new data and how they store state.
So far, we have discussed how static lynx objects are structured and evaluated. They can also change in response to outside events and store data. We'll start with a simple, non-stateful example where the position of the text box is set to be the position of the user's mouse.
For each of these sections of the type hierarchy, we will create a specification for each object in two steps:
Types are used to constrain the behavior of programs and, when they are well-formed, ensure correctness within the boundaries of the type system. In some dynamically typed languages like JavaScript or Python the typing is considered "weak" because there are very few rules about which objects can be used in which places. This is often seen as a feature because it is less confusing for beginners writing simple programs.
Lynx takes the opposite approach. Type structures can always be relaxed selectively, but when there is no type structure the ability to reason about programs is lost. One way of thinking about it is that a weakly typed language can be created within a strongly typed language by purposefully relaxing all of the rules, but a strongly typed language cannot be replicated in a weakly typed one, so the strongly typed language must be more expressive.
In most languages, abstract types like sum types, product types and function types are first-class concepts, but in Lynx they are subtypes of Lynx expressions. The meaning of the abstract types is given by their behavior upon evaluation. We will start with the simplest abstract types and work towards the most general ones. A brief note on notation: \(a:A\) means \(a\) is a subtype of \(A\).
What does it mean for an object to be an atom in type \(\tau\)? Do we want to say \(a:Atom \quad a:\tau\), or do we want to say that an object can be an atom in one type but not in another? I.e., with \(Nat = Zero \:|\: One \:|\: Two \:|\: Three \ldots\), this would correctly make the statement \(Three:successor(Nat)\). Also, what does it mean for an element to be of type atom? Can we write \(3:\bullet\)? This would imply that the atom type itself has an infinite number of elements.
We can say that any two atoms of the same type that are definitionally equal are equal in that type. This pattern is essential for reducing the global complexity of code, and its power will become more apparent when we introduce more complex types. If two developers are working on the same concept without knowledge of each other, this offers a mechanism to merge the two concepts into one and ensure the correctness of all dependent code. We will start with the simple case of atoms and add more complex cases as we introduce more complex types.
While any concept in Lynx can be represented as an Atom, programs constructed only from atoms would lack any structure. Let's first look at an example of a sum type of atoms. If we have two atoms called \(True\) and \(False\) then we can define a type called \(Boolean\) defined as \(Boolean = True \:|\: False\). If instances of the type \(Atom\) are defined as having exactly one possible value then the binary sum of two atoms has \(1+1 = 2\) inhabitants. This is where the name sum type comes from.
If the natural numbers are defined as \(Nat = Zero \:|\: succ(Nat)\) then we can define addition of natural numbers as a function \[\begin{aligned}Add &: (Nat, Nat) \rightarrow Nat\\ Add(Zero, n) &= n \\ Add(succ(x), y) &= Add(x, succ(y)) \end{aligned}\] Coverage checking ensures that when a definition pattern-matches on the elements of a sum type, all elements of the sum type are covered.
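A direct transcription of this definition (Haskell as a stand-in) shows what coverage checking protects:

```haskell
data Nat = Zero | Succ Nat

add :: Nat -> Nat -> Nat
add Zero     n = n
add (Succ x) y = add x (Succ y)
-- Removing either equation leaves the pattern match non-exhaustive,
-- which is exactly what coverage checking reports.
```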
Are sum types defined as pattern matching on attributes? (We can say \(zero : nat\) and \(successor(nat) : nat\), i.e. \(nat = sumtype(zero, succ(nat))\); declarations like \(zero:nat\), \(one:nat\), \(one:succ(zero)\) don't need to be minimal but they do need to be consistent. What is the implementation of this, and of the function search?) Sum types are expressed in terms of the isA relation. We will use the natural numbers as an example: we can say that \(zero\) isA naturalNumber, and that for any natural number \(n\), \(n.successor\) is a natural number. As alluded to in the design section, Lynx has a mechanism for programmers to generalize concepts over time and simplify the program.
\(\tau_1 \:|\: \tau_2 \:|\: \tau_3 \neq \tau_1 \:|\: (\tau_2 \:|\: \tau_3)\). Exhaustiveness checking is different for these two cases, but is exhaustiveness checking then the same for the parenthesized type? Is there actually an isomorphism there? When we are programming it is useful to start by defining the simple case of a type and then creating a more general version later. This process of generalization often gets programmers into trouble: they have used a specific version of a type somewhere in their code and the new generalization breaks it. Let's start with a very basic example. We can create a new object called "Zero" and say it is an element of the natural numbers by adding a subtype (isA) relation between it and the "natural number" object.
Now we can say that "One" is also a natural number by adding the definition \(\text{One}: \text{Nat}\), and likewise \(\text{Two}:\text{Nat}\).
We will use construction of a coordinate point as a motivating example throughout this chapter. A coordinate point consists of an x-coordinate and a y-coordinate and can be written as a pair \((x, y)\). Notice that this contains two pieces of information: an x-coordinate that is a number and a y-coordinate that is a number. We will use the notation \(x: \text{Number}\) (read "\(x\) is of type Number") as a shorthand. With this notation we can also say \(y: \text{Number}\) and \((x, y):\text{CoordinatePoint}\).
Formal type theory has different kinds of rules that describe how types are formed and used. We will introduce three kinds of rules: introduction rules, elimination rules and computation rules.
Given we have something called \(x\) that is a number and something called \(y\) that is a number, we can create a coordinate point \((x, y)\). The notation we use for this in type theory is: \[\frac{x:\text{Number} \quad y:\text{Number}}{(x, y):\text{CoordinatePoint}}\] The text above the line lists the assumptions required for the statement below the line to be true. So to create a coordinate point we need a number \(x\) and a number \(y\); assuming we have both of these, we can create a coordinate point.
Elimination rules are sort of the inverse of introduction rules. They ask: given that we already have a coordinate point, what information can we pull out of it? In this case there are two elimination rules, which we'll call \(E_1\) and \(E_2\):\[\begin{align} \frac{(x, y):\text{CoordinatePoint}}{x:\text{Number}}(E_1) && \frac{(x, y):\text{CoordinatePoint}}{ y:\text{Number}}\left(E_2\right)\end{align}\] \(E_1\) says that if we have a coordinate point we can extract an x-coordinate of type Number, and \(E_2\) says that we can also extract a y-coordinate of type Number. Notice that these elimination rules are a sort of inverse to the introduction rule: the introduction rule shows how to put pieces together and the elimination rules show how to take them back apart. This concept will be made more formal with the computation rule.

If all of this seems too basic, consider this: even though these rules are simple and may seem trivial, they form a structure of types that can be checked by a computer. As types get more complicated, the computer can check that the programmer has handled every case and that the types fit together. The mechanization this allows is the important part.
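As a sketch, the introduction rule corresponds to a constructor and the elimination rules to projections (MkPoint, e1 and e2 are names invented here for illustration):

```haskell
type Number = Double

data CoordinatePoint = MkPoint Number Number  -- introduction rule

e1 :: CoordinatePoint -> Number               -- elimination rule E1
e1 (MkPoint x _) = x

e2 :: CoordinatePoint -> Number               -- elimination rule E2
e2 (MkPoint _ y) = y
-- The computation rules then state that e1 (MkPoint x y) reduces to x
-- and e2 (MkPoint x y) reduces to y.
```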
Type systems are used to improve the correctness of code by preventing programmers from using pre-built abstractions in ways they were not designed for. The big question we should be asking when determining whether two types are equal is "Can these two types be used interchangeably without breaking anything?" Notice that this question is about the behavior of the types, not their structure. Most formal type systems, including algebraic data types (ADTs), focus on comparing structures, but two types can have the same structure and not be interchangeable without more information, or be interchangeable while having different structures. Let's go back to the example of a coordinate point to look at interchangeability.
2D coordinate points can be represented in Cartesian or polar coordinates.
Both of these coordinate points are pairs of numbers, so they have the same algebraic data type, but with ADTs there isn't any information about how to actually convert between the two types. Let's define two conversion functions:
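For example, a sketch of the two representations and the coercions between them (all names here are invented for illustration):

```haskell
type Number = Double

data Cartesian = Cartesian Number Number   -- (x, y)
data Polar     = Polar     Number Number   -- (r, theta), theta in radians

toPolar :: Cartesian -> Polar
toPolar (Cartesian x y) = Polar (sqrt (x * x + y * y)) (atan2 y x)

toCartesian :: Polar -> Cartesian
toCartesian (Polar r t) = Cartesian (r * cos t) (r * sin t)
```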
This is where the concept of interfaces in languages like C++ leaves the correctness of these functions to the programmer. Interfaces make no guarantees that the types are actually interchangeable.
To ensure that the two types are equal we need to make sure that every element of one type can not only be converted to the equivalent element of the other type but can also be converted back to the same element. This forces the conversion functions to work for every edge case. For example, converting the polar coordinate point \((1, 450^\circ)\) to Cartesian coordinates and back yields \((1, 90^\circ)\), since angles are only meaningful modulo \(360^\circ\). To show this we need the identity functions for both polar and Cartesian coordinate points:
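Reusing the types from the sketch above, the identity functions are simply:

```haskell
idCartesian :: Cartesian -> Cartesian
idCartesian p = p

idPolar :: Polar -> Polar
idPolar q = q
```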
Next we need to show that each identity function is equal to the corresponding composition of the coercion functions.
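In terms of the earlier sketch, the claim is a pair of function equalities, read under extensionality and only up to floating-point error and angle normalization:

```haskell
roundTripCartesian :: Cartesian -> Cartesian
roundTripCartesian = toCartesian . toPolar   -- should equal idCartesian

roundTripPolar :: Polar -> Polar
roundTripPolar = toPolar . toCartesian       -- equals idPolar only up to
                                             -- normalizing angles: the
                                             -- (1, 450 degrees) edge case
```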
Lynx has a richer notion of equality of elements than other languages. It is built on top of Lynx expressions in the form of equality relations.
Atomic Equality \[\frac{v:\tau}{v=_\tau v}\]

Expression Equality \[\frac{e_1\Downarrow_\tau v_1 \quad e_2\Downarrow_\tau v_2 \quad v_1\equiv_\tau v_2}{e_1 =_\tau e_2}\]

Product Type Equality \[\frac{e_1 =_\tau e_2 \quad e_3 =_\sigma e_4}{e_1 \times e_3 =_{\tau \times \sigma} e_2 \times e_4}\]

Sum Type Equality \[\frac{}{}\] As an example, we will use two representations of the natural numbers: we want to be able to prove that the binary representation of the natural numbers is equal to the unary representation.

Function Equality. One approach to determining equality of functions is called function extensionality. Under this definition, two functions \(f: A\rightarrow B\) and \(g: A \rightarrow B\) are equivalent iff for every \(a:A\), \(f(a) =_B g(a)\).

Reflexivity \[\frac{e:\tau}{e =_\tau e}\]

One of the goals of Lynx is to separate the specification of a program from its implementation. The specification and implementation of the language itself are no different. Lynx is (eventually) written in itself, so the underlying implementation of the language can be swapped out without affecting its behavior.
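A sketch of the two representations of the naturals mentioned above (the little-endian binary encoding is an assumption of this example), with a coercion in one direction:

```haskell
data Unary  = Z | S Unary                -- Nat = Zero | succ(Nat)
data Binary = End | O Binary | I Binary  -- bits, least-significant first

-- Interpret a binary numeral as a unary one.
toUnary :: Binary -> Unary
toUnary End   = Z
toUnary (O b) = double (toUnary b)
toUnary (I b) = S (double (toUnary b))

double :: Unary -> Unary
double Z     = Z
double (S n) = S (S (double n))
-- A full equality proof also needs a coercion the other way and the
-- round-trip laws; that is subtle here because this binary encoding is
-- not canonical (trailing O bits), which sum type equality must address.
```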
The most basic way the implementation and the specification are separated is with the concept of primitives. A specification for a basic building block of computation serves as the denotational semantics for an underlying language feature that does the actual computation. For example, we could define a type in lynx for 32 bit unsigned integers as a list of binary digits with length 32. We can then define a WebAssembly uint32 as a primitive for this type. The primitive states without proof (but with tests) that the specification for a uint32 has a one-to-one correspondence with a WebAssembly uint32. The compiler can then decide to use the WebAssembly uint32 as an optimization. In practice, this is how all computation in Lynx is done.
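The specification side of that example might look like the following sketch (the names and the most-significant-bit-first encoding are assumptions; the correspondence with the WebAssembly uint32 would be asserted by the primitive, not proven):

```haskell
-- Specification: a uint32 is a list of exactly 32 binary digits.
-- The length invariant is assumed here rather than enforced by the type.
newtype UInt32 = UInt32 [Bool]

-- What a bit list denotes as a number, most-significant bit first.
denote :: UInt32 -> Integer
denote (UInt32 bits) = foldl (\acc b -> acc * 2 + (if b then 1 else 0)) 0 bits
```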
Each object in Lynx is identified by a hash. Informally, the hash of an object is calculated by recursively hashing all of the child nodes of the current object. Let \(H: \text{LynxObject} \rightarrow \{0,1\}^n\) be a cryptographic hash function. The hash of a node is the result of hashing its children together, forming an unbalanced Merkle tree. Starting at the bottom of the tree, we apply the hash function \(H\) to the values 20, 30 and "text" respectively. Leaf nodes are prepended with a 0x00 byte and internal nodes with a 0x01 byte to prevent a second-preimage attack.
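A sketch of this scheme, reusing the LynxObject model from earlier (SHA-256 stands in for \(H\), and the byte-level encoding of atoms is an assumption of this sketch):

```haskell
import qualified Data.ByteString as BS
import qualified Data.ByteString.Char8 as C8
import Crypto.Hash.SHA256 (hash)  -- package cryptohash-sha256, assumed available

data LynxObject = Atom String | Object [(LynxObject, LynxObject)]

-- Leaves are prefixed with 0x00 and internal nodes with 0x01, and child
-- hashes are concatenated, forming an unbalanced Merkle tree.
hashObject :: LynxObject -> BS.ByteString
hashObject (Atom s) =
  hash (BS.singleton 0x00 <> C8.pack s)
hashObject (Object attrs) =
  hash (BS.singleton 0x01 <>
        BS.concat [ hashObject a <> hashObject v | (a, v) <- attrs ])
```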