SMT-LIB Version 3.0 - Preliminary Proposal

Last updated: 2021-12-31

Overview

This page contains a preliminary high-level proposal for SMT-LIB Version 3. The proposal is still being worked out, and this document is an incomplete description of the new format. A reference document is under construction and will contain a description of the proposal in full detail. This page focuses on the most salient aspects of the proposal, for a quicker overview. More content will be added with time as needed.

Preamble

The great majority of the changes described below will affect the way theories and logics are defined. It will not affect scripts that rely on (the equivalent of) SMT-LIB 2.6 logics. This means that most of the features of the Version 3 will not have to be supported by current SMT solvers. Support for some of new features of SMT-LIB 3 can be introduced gradually over time in a solver as a consequence of deciding to support new theories and logics that rely on those features.

New underlying logic

The main new aspect of the proposed new version is the move from (an extension of) many-sorted first-order logic as the underlying logic of SMT-LIB to a higher-order logic with polymorphism and dependent types but classical semantics. An important aspect of this change is that the new language for SMT-LIB scripts strives to be as backward-compatible as possible to that of Version 2.6. This is achieved in two ways:

  1. Giving new meaning to old syntax as needed.
  2. Defining the formal semantics so that it is essentially the same as the old one over the old syntax.
For example, now sorts from SMT-LIB 2.6 are interpreted as simple types, with parametric sorts interpreted as polymorphic types. Sorted constant and function symbols are interpreted as typed constants, with function symbols of arity > 1 becoming higher-order constants, Curry-style. Index symbols, as in (_ extract i j) or (_ BitVec 4), are now seen as syntactic sugar for symbols with a dependent type/kind. As in SMT-LIB 2, a number of additional syntactic restrictions on scripts are imposed to obtain various fragments of interests or logics, in SMT-LIB 2 terminology.

The underlying logic of SMT-LIB 3 has many elements of the Calculus of Inductive Constructions (CIC) used by proof assistants like Coq and Lean, with the main restriction of allowing only rank-1 polymorphism (actually, let-polymorphism) and not having a Prop-like kind for (constructive) propositions. So, contrary to CIC, formulas are not types and instead continue to be expressed as terms of a two-element Bool type; this is similar to the logic of the PVS proof assistant or proof assistants of the HOL family. Differently from those provers, types can depend both on other types (i.e., be polymorphic) and on values. Consistently with CIC, polymorphism is achieved by allowing functions to take as arguments not just values but also types, with the restriction, however, that all type arguments must come before value arguments which, in turn, cannot have a polymorphic type (to satisfy the restriction to rank-1 polymorphism).

Motivation

The main motivation for the move to a CIC-like logic is manyfold.

  1. First, using a more powerful logic as the underlying logic considerably simplifies the design and the formal foundations of the SMT-LIB standard. It obviates the need for most of the ad-hoc extensions to many-sorted first-order logic we had to introduce in Version 2 while also providing enough expressive power to define most theories formally, as opposed to textually as is done in SMT-LIB 2.6.
  2. Second, it allows the definition of a single language to define theories, logics, and benchmarks.
  3. Third, it opens the possibility for SMT solvers to provide support for some level of higher-order reasoning, facilitating their integration as automated reasoning back ends into higher-order theorem provers and into proof assistants, which are also based on a higher-order logic. Current integrations rely on complex encodings from the higher-order logics supported by these tools to the logic of SMT-LIB 2.6. Moving to a higher-order logic will dramatically simplify these encodings, improving the trustworthiness of the integration. Natively reasoning with higher-order constructs (e.g., via higher-order equational reasoning) by the SMT solver could also considerably improve solving times on goals coming from these tools.
  4. Fourth, it opens the way to the introduction of a number of new SMT-LIB theories (for sets, relations, database tables, sequences) which benefit from the availability of second-order functions such as as fold, map, filter and so on, both to define common operations and to reason about them.

We stress that the move to the new logic maintains backward compatibility with SMT-LIB 2.6 to a very large extent. In particular, it will affect SMT-LIB 2.6-compliant solvers in a minimal way if their developers choose not to support the new features.

The core language

Terms in this logic are built out of variables, constants, applications, Π-abstractions, and λ-abstractions, as in the CIC calculus. There are three classes of (well-formed) terms in the language:

  • terms denoting values, such as 3, (+ x 3), (lambda ((x Int)) (+ x 3));
  • terms denoting types, such as Bool, Int, (-> Int Real), (Array Int Bool), (List Int), (BitVec 3);
  • terms denoting kinds, such as Type, (-> Int Type), (-> Type Type).
The symbol -> will denote the function type constructor. We will write it as in this document from now on, for readability.

Well-formed value terms have an associated type term (or, simply, type). Well-formed type terms have an associated kind term (or, simply, kind). For instance,

  • 3 is a constant of type Int, which has kind Type;
  • not is a constant of type (→ Bool Bool), which has kind Type;
  • List is a type constructor of kind (→ Type Type);
  • BitVec is a type constructor of kind (→ Int Type).
Note that is both a type and a kind constructor. The syntax (→ α β) can be understood as an abbreviation of the CIC type or kind Π x:α β. The constructor can be used as a multiarity right-associative symbol. This allows one, for example, to write the type (→ A (→ B C)) as (→ A B C), and the kind (→ Type (→ Int Type)) as (→ Type Int Type). Function symbols of rank (σ₁ ⋅⋅⋅ σᵢ) in Version 2.6 become constants of type (→ σ₁ ⋅⋅⋅ σᵢ) in Version 3. This means that function symbols with more than one argument become higher-order functions in Version 3. For instance, the integer addition operator + now has type (→ Int Int Int), that is, (→ Int (→ Int Int)).

Dependent Types

In Version 3, functions that have a dependent type τ take as extra arguments also the types or value that τ depends on. For instance, bit vector functions take as argument also the size of the bit vector. For conciseness and backward compatibility, however, such arguments are declared as implicit any time they can be inferred from later arguments. Concretely, dependent function types are expressed with the syntax illustrated in this example:

  (→ (! Int :var n) (BitVec n) (BitVec (+ n 1)))
This expression denotes the type of a function that takes as input an integer n and returns a function that takes a bit vector of size n, and returns a bit vector of size n + 1. Note that (! Int :var n) uses the same term annotation syntax as in Version 2 but now applied to types. The new predefined :var attribute annotating Int in this case provides a name (n) for the first input. The general syntax for dependent types is
  (→ (! A :var x) B)
where x is a bound variable whose scope is B. An alternative syntax, more reminiscent of that of dependent type logics might have been:
(Pi (x A) B)
or
(Forall (x A) B)
which perhaps makes it clearer that x is a variable bound by the binder Pi/Forall in the scope B. However, the annotation based syntax is possibly more legible and, more importantly, more flexible because it allows the annotation of function inputs with further attributes. In particular, it allows one to declare an input argument as implicit. This is done with the new valueless attribute :implicit that can be associated to a function argument. For example, :
  (→ (! Int :var n :implicit) (BitVec n) (BitVec n))
makes the first argument implicit — in the sense that it is not needed in applications of functions of that type. This is sensible since the value of the first argument can be inferred from the type of the second argument. A further attribute, :restrict, permits the imposition of semantic restrictions on input and output values of a function, something that, in its full generality, effectively allows the definition of PVS-style predicate subtypes (aka, refinement types). For instance, in type
  (→ (! Int :var n :implicit :restrict (> n 0)) (BitVec n) (BitVec n))
the implicit argument n is required to be a positive integer. In PVS syntax, this would be the dependent type
  [n: {m : Int | m > 0} → BitVec(n) → BitVec(n)]
This notation can also express relational constraints on the input and output types, as for instance, in
  (→ (! Int :var m :implicit :restrict (> m 0)) 
     (! Int :var n :implicit :restrict (> n m)) 
     (BitVec m) (BitVec n) (BitVec (- n m))) 
Yet another attribute, :syntax, allows the introduction of syntactic restrictions on input terms:
  (→ (! Int :var n :implicit :restrict (> n 0) :syntax <numeral>) 
     (BitVec n) (BitVec n))
The additional restriction :syntax <numeral> specifies that the only permitted applications of functions of the type above are those where n is a concrete constant (1, 2, ...) and not a symbolic term (x, (+ x 1), ...). This is convenient, for instance, when defining the current SMT-LIB 2 logics with bit vectors, where bit vector sizes cannot be symbolic. The language includes constructs to define syntactic categories like <numeral> (see later).

Polymorphic Types

Polymorphic functions are now expressed as having input types that depend on types provided as input, eliminating the need of the par binder from Version 2.6. For example, the array select function now has type:

  (→ (! Type :var I :implicit) (! Type :var E :implicit) (Array I E) I E) 
where the first two, implicit arguments I and E are types. (Type is the kind of all types.) This style applies to user-defined polymorphic algebraic datatypes as well. For instance, the empty list constant nil and the list constructor constant cons would now have type
  (→ (! Type :var X) (List X))   ; type of nil
and
  (→ (! Type :var X :implicit) X (List X) (List X))   ; type of cons
respectively. Note that the type parameter for nil is not implicit in this case, which requires one to write (nil Int) for the empty list of integers, for instance. This solution eliminates the ambiguity problem for polymorphic symbols in Version 2.6 where nil would be an overloaded nullary symbol, making the application of the ascription operator as (i.e., (as nil (List Int))), which is mandatory in SMT-LIB 2.6, unnecessary in this case.

Terms

A new core binder in terms is now the lambda binder whose general form is

  (lambda ((x T)) t)
where t is a value term and x is a bound variable of type or kind T with scope t. This allows the construction of lambda abstractions such as
  (lambda ((A Type)) (lambda ((x A)) (lambda ((y A)) (not (= x y)))))
or, more concisely, (lambda ((A Type) (x A) (y A)) (not (= x y))).

Since a constant f of type (→ σ₁ ⋅⋅⋅ σᵢ) is a higher-order function when i > 2, partial applications of such constants, e.g., (f t), are meaningful and legal. The syntax (f t₁ ⋅⋅⋅ tᵢ) for the full application of f remains the same as in Version 2.6 although it is now seen as an abbreviation of ( ⋅⋅⋅ ((f t₁) t₂) ⋅⋅⋅ tᵢ).

As in Version 2, formulas are terms of type Bool. Predefined constants include

  • the Boolean constants true and false;
  • the equality operator =, now a polymorphic constant of type (→ (! Type :var A :implicit) A A Bool)));
  • the ite operator, now with type (→ (! Type :var A :implicit) Bool A A A))).

A core language of commands allows one to declare new type constructors and constants, and to assert formulas.

; Example
(declare-type A ())   ; declares a new simple type (a nullary type constructor) 
(declare-type B ()) 
(declare-const a A)   ; declares a constant of type A
(declare-const f (→ A B))   ; declares a constant of type (→ A B)
(declare-const g (→ A B))
(declare-const p (→ A B Bool))
(assert (= b (f a))) 
(assert (= f g))                                   ; higher-order assertion
(assert (= p (lambda ((x A) (y B)) (= (g x) y))))  ; higher-order assertion

Constructors for polymorphic/dependent types can be declared as in the examples below:

  (declare-type Collection (Type))
  (declare-type Pair (Type Type)) 
  (declare-type BitVec (Int)) 
  (declare-type Vector (Type Int)) 
Type constructor Collection has kind (→ Type Type). As an example, the type
  (→ (! Type :var X) (Collection X))
built using this constructor is polymorphic; more precisely, it is a dependent type with one type parameter: X .
Type constructor Pair has kind (→ Type Type Type). The type
  (→ (! Type :var A) (! Type :var B) (Pair A B))
is a dependent type with two type parameters: A and B.
Type constructor BitVec has kind (→ Int Type). The type
  (→ (! Int :var n) (BitVec n))
is a dependent type with one value parameter: X.
Type constructor Vector has kind (→ Type Int Type). The type
  (→ (! Type :var X) (! Int :var n) (Vector X n))
is a dependent type with one type parameter and one value parameter: X and n, respectively.

Note that since declare-type is essentially a constant declaration but the at level of kinds, the command (declare-type Vector (Type Int)), say, could be understood as an abbreviation of (declare-const Vector (→ Type Int Type)), although the latter syntax is not actually allowed.

Restrictions on parameters in a type

While semantic restrictions on types yield the full power of predicate subtyping, their intended use in SMT-LIB 3 is to document precisely the input domain over which a partial function or type constructor is defined. So they are meant to be used as formal documentation, not as the definition of a subtype. The type system SMT-LIB 3 remains without subtypes which means that semantic restrictions on types can be completely ignored for type checking purposes. That said, solvers can use type restriction information to provide more informative messages in case of scripts that use partial functions outside of their domain of definition. As an example, the integer division operator in Version 3 is defined as follows

  (declare-const div (→ Int (! Int :var n :restrict (distinct n 0)) Int) 
A solver could use the knowledge of the official restriction above for instance to issue a warning when it returns a model that gives value 0 to a term that occurs as the second argument of a div application in an assertion. Alternatively, it could try to determine statically, before solving an input problem, if all applications of partial functions in the problem are provably to values within the function's domain, and issue a warning otherwise.

As another example, the actual bit vector type in the Version 3 bit vector theory is defined as follows:

  (declare-type BitVec ((! Int :var m :restrict (> m 0)))
Since the parameter m expresses the size of the vector, the added restriction limits the value of m to the positive integers. (One could argue that the type constructor is well defined also for m = 0 but this is a discussion for another time).

Note that syntactic restrictions on types are orthogonal to semantic ones and have a different purpose: to restrict the set of expressible terms and formulas in a benchmark so as to achieve decidability of the satisfiability problem or some other computational objective — as done, for instance, in the various BV logics of SMT-LIB 2. Concretely, in modules corresponding to SMT-LIB logics (see later), the parameter m of the bit vector type above is further constrained to be a positive numeral, as opposed to an arbitrary Int term, as follows:

  (declare-type BitVec ((! Int :var m :restrict (> m 0) :syntax <pos_numeral>))
This means that, in those logics, the BitVec constructor can be applied only to positive numerals. The same mechanism is applied in logics of linear integer arithmetic to restrict applications of the multiplication operator to linear multiplications.

Commands

The SMT-LIB 3 language contains almost all commands in Version 2 as well as an additional number of new constructs and commands. A large number of them, however, are defined in terms of the core language above. This includes all the commands from Version 2, none of which change semantics in practice in Version 3. Most of them become syntactic sugar of core Version 3 commands and some of them will be deprecated and eventually phased out. For instance, declare-const is now a core command while declare-fun is not. The SMT-LIB 2.6 expression

  (declare-fun f (τ₁ ⋅⋅⋅ τᵢ) τ)
with i > 0 is now an abbreviation of
  (declare-const f (→ τ₁ ⋅⋅⋅ τᵢ τ))
Similarly,
  (define-fun f ((x₁ τ₁) ⋅⋅⋅ ((xᵢ τᵢ)) τ t)
  (define-fun-rec f ((x₁ τ₁) ⋅⋅⋅ ((xᵢ τᵢ)) τ t)
are now respective abbreviations of
  (define-const f (→ τ₁ ⋅⋅⋅ τᵢ τ) (lambda ((x₁ τ₁) ⋅⋅⋅ ((xᵢ τᵢ)) t))
  (define-const-rec f (→ τ₁ ⋅⋅⋅ τᵢ τ) (lambda ((x₁ τ₁) ⋅⋅⋅ ((xᵢ τᵢ)) t))
where define-const and define-const-rec are new commands.

In a similar vein, sorts are not primitive anymore and become types. Correspondingly, declare-sort is now a special case of the new core command declare-type. For instance,

  (declare-sort S n)
with n ≥ 0 now becomes an alternative syntax for
  (declare-type S (Type ⋅⋅⋅ Type))
with n occurrences of Type. Similarly,
  (define-sort S (u₁ ⋅⋅⋅ uᵢ) σ)
now becomes an alternative syntax for
  (define-type S ((u₁ Type) ⋅⋅⋅ (uᵢ Type)) σ)
Note that declare-type and define-type are more general than their Version 2 counterparts because they can introduce (value) dependent types as well, as in
  (declare-type Vector (Type (! Int :var m :restrict (>= m 0)))
  (define-type RVector ((n Int :restrict (> n 0))) (Vector Real n))
  (define-type RV8 () (RVector 8))

Binders

The primitive binders in Version 3 are

  • let, the parallel version of local definitions ((let ((x1 t1) ⋅⋅⋅ (xn tn)) t)), as in Version 2.6;
  • lambda, for function (λ) abstraction ((lambda ((x1 τ1) ⋅⋅⋅ (xn τn)) t)), a new binder;
  • forall, for universal quantification of types in formulas ((forall ((τ1 Type) ⋅⋅⋅ (τn Type)) φ)), a new binder;.
Another new binder is choose, for the Hilbert choice operator ε, although technically it is a second-order function that can be used with binder syntax. The let binder is the same as in Version 2 with the addition that now it can be used to define higher-order variables. The other binders above are new. The match binder for algebraic data types is defined in terms of let as before. The forall and exists quantifiers from Version 2, however, are not primitive anymore and are instead defined as higher-order functions. The syntax that uses them as binders becomes syntactic sugar for terms based on lambda. For instance, forall is now defined as the function
  (lambda ((A Type) (P (→ A bool))) (= P (lambda ((x A)) true)))))
of type (→ (! Type :var A :implicit) (→ A Bool) Bool)). The old syntax, with expressions of the form
  (forall ((x₁ τ₁) ⋅⋅⋅ (xᵢ τᵢ)) φ) 
is maintained but it is understood as an abbreviation of
  (forall (lambda ((x₁ τ₁) ⋅⋅⋅ (xᵢ τᵢ)) φ)) 
Similarly, the new binder syntax
  (choose (x τ) φ)
abbreviates
  (choose (lambda ((x τ)) φ))

Polymorphic definitions and assertions

[To do. Polymorphic functions can be introduced in declarations and in constant and let definitions, but they cannot be passed as arguments in function applications (let-polymorphism). More generally, the input type of each function symbol has to be a monotype. Asserted formulas have to be in prenex form with respect to universal type quantifiers. ]

New language constructs

Some of the new constructs for terms, such as lambda and choose, are specific to the move to higher-order logic. Other novel constructs are orthogonal to that extension and are motivated by a desire to unify in a single language the various sublanguages of SMT-LIB 2.6 for theory definitions, logic definitions, and command scripts. This is done by extending the command language to allow the definition of theory symbols in the same style as user-defined symbols and by introducing a notion of module that can be used to define both theories and logics.

Modules

Modules are a general construct to structure scripts in units with their own name space and provide a basic form of encapsulation and information hiding. For the medium term, modules will not be allowed in user scripts. The hope, however, is that with time they will be supported by SMT-solvers and so will eventually be available to regular users. For now, they will be used only to define SMT-LIB theories and logics.

; Example
(define-module M (
  (declare-type A ())
  (declare-type B ()) 
  (declare-const a A)
  (declare-const f (→ A B))  
  ; The scope (visibility) of the symbols A, B, a, and f is limited to module M
  )
)
; A, B, a, and f are not accessible here

Within a module, overloading of constant symbols is fully allowed. The same constant can be declared multiple times as long as it has a different type every time. More precisely, since we have polymorphic constants, a constant c can be given a new type τ (with a declare or define command) only if τ has no instances in common with any of the current types of c. For instance, if c has type (→ (! Type :var X) (Array Int X)), it cannot be re-declared later to have any of these types:

(→ (! Type :var Y) (Array Int Int))
(→ (! Type :var Y) (Array Y Int))
(→ (! Type :var X) (Array Int (Array Int X)))
because all of them have instances in common with (→ (! Type :var X) (Array Int X)). In contrast, it would be fine to re-declare c with type Int, say. In the latter case, c would become an overloaded symbol with two (principal) types: (→ (! Type :var X) (Array Int X)) and Int. (In fact, it would be fine even to redeclare c with type (Array Int Int), say, since the second c would have arity 0 instead of 1, making its type not an instance of the function (→ (! Type :var X) (Array Int X))). It is an error to have multiple declarations of the same constant that violate the policy above.

Note that (permitted) overloading of a constant c can make the constant or some of its applications ambiguous, in the sense of SMT-LIB 2.6 that the type of c or terms of the form (c t₁ ⋅⋅⋅ tᵢ) cannot be uniquely determined by bottom-up type inference. In those cases, the ascription operator as must be used to disambiguate the constant, as in Version 2.6. The main difference is that function types are first-class in Version 3 and so ascription now specifies the whole type for a function symbol (e.g., (as f (→ Int Bool)) instead of just its return type (e.g., (as f Bool)).

Consider for instance the following module:

(define-module M (
  (declare-type A ())
  (declare-type B ()) 
  (declare-const a A)
  (declare-const a B)
  (declare-const b B)
  (declare-const f (→ B A))
  (declare-const f (→ A A)) 
  (declare-const g (→ B A))
  (declare-const g (→ B B))
  )
)

The constants a, f, and g, for being overloaded, are ambiguous when used as an argument in an application, e.g., (= a b), or (= f g). Also ambiguous are applications of g (e.g., (g b)). However, applications of f (e.g., (f b)) are not. As a consequence, the uses of as in the assertions below are all necessary.

 (assert (= (as a A) (f b))) 
 (assert (= (as a B) b)) 
 (assert (= b ((as g (→ B B)) b))) 
 (assert (= (as f (→ B A)) (as g (→ B A))))
The different use of as in Version 3 represents one of the few changes that are not backward compatible with Version 2.6. However, this is not a serious concern since as has been used mostly for 0-arity constants where difference between the new and the old use disappears.

Module Interfaces

By default, all the symbols (i.e., constants and type constructors) declared and defined in a module M are public, i.e., visible by modules importing M. Optionally, however, it is possible to specify explicitly an interface for the module, which selects the symbols that are exported, that is, made visible. Interfaces are used to construct modules that correspond to logics in the sense of SMT-LIB 2. The interface is specified with two attributes in a module definition:

  • :types whose value is the list of exported type constructors with their associated kind, and
  • :consts whose value is the list of exported constants with their associated type.
(The type/kind constructor does not need to be exported because it a primitive symbol of the underlying logic.)

  ; Example
  (define-module M
   (
    (declare-type A ())    (declare-type B ())     (declare-type C ()) 
    (declare-const a A)    (declare-const b B)
    (declare-const f (→ A B))
    (declare-const g (→ B A)) 
    (define-const  h (→ A A) (lambda ((x A)) (g (f x))))
    (declare-const c (→ A C))
   )
   :types ( (A Type) (B Type) )
   :consts ( (a A) (f (→ A B)) (h (→ A A)) )
  )

In the example above, only two of the type constructors and three of the constants are exported. Note that the types of the exported constants must be constructible from the exported type constructors for the interface to be well formed. Also note that any relationship among the exported constants established in the module through non-exported constants (or through assertions) is maintained when the module is imported. For example, the formula

  (forall ((x1 A) (x2 A)) (=> (= (f x1) (f x2)) (= (h x1) (h x2)))
)
is valid not just in the module M above but also in any module that imports M. Concretely, this means that importing a module has always the effect of declaring and asserting everything in the module. The only effect of the interface is to prohibit in the importing module any direct reference to non-exported symbols.

The :types attribute is optional. Its absence causes all type constructors in the module to be exported. In contrast, giving it value () causes no type constructors to be exported. The same policy applies to :consts.

The reason constants are listed in a module's interface with an associated type is that its is possible to assign them a more restricted type than the one they have in the module. For instance, the array select function, which has type

  (→ (! Type :var I :implicit) (! Type :var E :implicit) (Array I E) I E) 
in the module declaring it, could be exported as follows:
  :types ( (Bool Type) (Int Type) (Array (→ Type Type Type)) ... ) 
  :consts ( (select (→ (! Type :implicit) (! Type :var E :implicit) (Array Int E) Int E))
            (select (→ (! Type :var I :implicit) (! Type :implicit) (Array I Bool) I Bool))
            ... ; constants other than select 
          )
This would limit the application of select only to values of the listed types. Note that the interface above is not exporting two select functions; it is exporting the same polymorphic function but with two (disjunctive) restrictions on its instances.

The exported type constructors can be restricted in similar way, disallowing in the importing modulo any terms whose types does not conform to the restrictions. For example, a module having a type constructor Seq (for generic sequences of len n) of kind: (→ (! Int :var n :restrict (>= n 0)) Type Type) could export it as follows:

  :types 
  ( (Seq (→ (! Int :var n :syntax <numeral> :restrict (even n)) Type Type)
    ... 
  )
which (cumulatively) restricts the application of Seq only to positive even numerals. The general rule is that the restrictions imposed on an exported symbol in a module interface are in addition to any restrictions already imposed on the symbol within the module. The meaning of multiple syntactic restrictions is the intersection of the languages denoted by the individual restrictions. The meaning of multiple semantic restrictions is the conjunction of the individual restrictions.

For polymorphic types, additional restrictions are expressible by instantiating type parameters. This can be done indirectly in module interfaces by defining a new type constructor as an instance of some polymorphic type, and then exporting the new constructor.

(define-module M (
  ...
  (define-type IntArray ((A Type)) (Array Int A))
  ...
  )
 :types ( (IntArray (→ Type Type)) )
 :consts ( ... )
)
disallow any array other than two dimensional arrays with integer indices.

A natural question is why add restrictions on an exported symbol (type constructor or constant) in the interface of a module and not directly inside the module when the symbol is declared. The reason is that a module can import a symbol from another module and export a restricted version of it. This approach is followed in defining modules that correspond to SMT-LIB 2 logics. Such modules import symbols from theory modules in their full generality and then export them with restrictions. For instance, a linear integer arithmetic module would import the arithmetic symbols from the module defining the integers and export the multiplication symbol with the additional restriction that at least one of its arguments is a (concrete) integer value:

  :consts 
  ( (* (→ (! Int :syntax <int_value>) Int Int))
    (* (→ Int (! Int :syntax <int_value>) Int)
    ... 
  )

Module Imports

Modules can be imported into another module or at the top level in an SMT-LIB 3 script with the new command import which lists all the modules to be imported. At most one import command is allowed in a module. Moreover, the command has to occur before any command that modifies the context (declarations, assertions, ...). At the top level, later import commands are allowed only if interleaved with reset commands. Because two modules can import the same module, it is possible to effectively import a module directly and indirectly several times. The effect of multiple imports is cumulative for symbols and disjunctive for their restrictions: if the same symbol is first imported with a restriction R₁ and then with a restriction R₂ then the disjunction of R₁ and R₂ is considered.

  ; Example
  [To do]

Qualified names

Every module automatically defines a name space corresponding to it. The name space is reflected in the generation of qualified names for the symbols exported by a module. Qualified names have the form M::n where M is the module's name and n is the name of a symbol declared in M and exported by M. Note that using :: as a separator in qualified names does not break backward compatibility because :: is not allowed in Version 2.6 identifiers.

  ; Example
  (import (Ints Reals))

  (declare-const n Ints::Int)
  (declare-const x Reals::Real)
  (define-const i Ints::Int (Ints::+ n Ints::2))
  (define-const y Reals::Real (Reals::+ x Reals::4.3))

A new command open can be used to allow unqualified names for the (exported) symbols of an imported module.

  ; Example 1
  (import (Ints Reals))
  (open Ints)
  (open Reals)
  (declare-const n Int)           ; Int is an alias of Ints::Int
  (declare-const x Real)          ; Real is an alias of Reals::Real
  (define-const i Int (+ n 2))    ; + and 2 are aliases of Ints::+ and Ints::2
  (define-const y Real (+ x 4.3)) ; + and 4.3 are aliases of Reals::+ and Reals::4.3
Note how opening two modules creates the possibility of overloading of at the level of the importing module, as is the case with open in the example above.
  ; Example 2
  (define-module M1 (
  (declare-type A ())
  (declare-type B ())
  (declare-type C ()) 
  (declare-const a A)
  (declare-const f (→ A B)) 
  (declare-const P (→ A B Bool))
  (assert (P b (f a)))
 )
 :types (A B)
 :consts ((a A) (f (→ A B)))
)
(import (M1))
; all exported symbols of M1 are now visible with their fully qualified name
; all assertions in M1 are now in the assertion context.
(declare-const a1 M1::A) ; declares a1 in the top-level namespace
(assert (= a1 M1::a))    ; adds this equality to the assertion context
(open M1) 
; now all symbols of M1 can be used without qualification
(declare-const b2 B) 
(assert (= b2 (f a))) ; B, f and a are from M1

The effect of opening a module M is to create a local alias for each symbol exported by M. In the example above, (open M1) can be understood as an abbreviation of

  (define-type A () M1::A)
  (define-type B () M1::B)
  (define-const a A M1::a)
  (define-const f (→ A B) M1::f)

A module M importing a module N can export the symbols exported by N as if they were its own. It has the option, however, to export them with stronger restrictions. The name used in M's interface to export N's symbols can be its fully qualified name (with the prefix N::) or its short version if N is opened in M.

  ; Example
  [To do]

An important point is that all module imports declare their symbols at the top level, which means that qualified names are never nested. Importing a module M at the top level that in turn imports a module N internally has the effect of first importing N at the top level and then M. However, since M is the module being directly imported at the top level, the only symbols of N visible (that is, usable in later commands) are those re-exported by M, if any. If M opens N, the module prefix of the fully qualified names of the re-exported symbols of N is indifferently M:: or N::. Concretely, if s is a type constructor or constant exported by N and then re-exported by M, it is accessible at the top level as N::s or as M::s, or simply as s once M has been opened; it is not accessible as M::N::c. (The latter would be the case if N's definition itself was inside M's which is, however, not allowed since all modules are defined at the top level.)

Import Graph

Considering the top level as an implicit (special) module, let us say that a module N is imported directly by a module M if N appears explicitly in an import command in M. We say that N is imported indirectly by M if N is imported, directly or indirectly, by a module imported by M. Circular import dependencies are disallowed, that is, a module cannot import itself, directly or indirectly. In other words, the "directly imports" relation always defines a direct acyclic graph over modules. Without interleaving calls to the reset command, a module N can be imported directly at most once but can be imported indirectly many times as a result of importing several modules that in turn import N. Another consequence on the import structure is that a module's symbol can be imported in a module multiple times and with different restrictions. The effect of two imports of symbols from the same module is additive: symbols imported with the second import that had not already been imported with the first are added to the top level; however, no already imported symbols are removed. If the same dependent constant c is imported once with a (restricted) type τ₁ and then with a (restricted) type τ₂ then it can be used with either type. (Note that c is not overloaded in this case because type τ₁ and τ₂ both are restrictions of the same principal type that c has in the module it originally was introduced in.)

  ; Example
  [To do]

Inductive Data Types

Algebraic data types of SMT 2.6 are generalized to inductive data types and their definition gets a new syntax. The old syntax is accepted but deprecated and will be eventually phases out.
The new syntax is best explained with a few examples. The old syntax:
  (declare-datatypes ((Size 0) (BinTree 0) (Option 1) (List 1) (Pair 2))
    ( ; Size
      ((small) (medium) (large))
      ; BinTree
      ((empty) (node (left BinTree) (right BinTree)))
      ; Option
      (par (V) ((none) (some (val V)) ))
      ; List
      (par (E) ((nil) (cons (head E) (tail (List E)))))
      ; Pair
      (par (A B) ((pair (first A) (second B))))
    )
  )
stands for new syntax:
  (define-inductive-types 
    ((Size ()) (BinTree ()) (Option (Type)) (List (Type)) (Pair (Type Type)))
    ( ; Size
      ( (small Size)
        (medium Size)
        (large Size)
      )
      ; BinTree
      ( (empty BinTree)
        (node (-> (! BinTree :selector left) (! BinTree :selector right) BinTree))
      )
      ; Option
      ( (none (→ (! Type :var A)   ; Type input is explicit for this constructor
                 (Option A)))    
        (some (→ (! Type :var A :implicit) 
                 (! A :selector val) (Option A)))
      )
      ; List
      ( (nil (→ (! Type :var E)    ; Type input is explicit
                (List E))) 
        (cons (→ (! Type :var E :implicit) 
                   (! E :selector head) (! (List E) :selector tail)) (List E))
      )
      ; Pair
      ( (pair (→ (! Type :var A :implicit) (! Type :var B :implicit) 
                 (A :selector first) (B :selector second) (Pair A B))))
      )
    )
  )

The command define-inductive-types takes two arguments:

  1. a list of pairs of consisting of a type constructor (the name of the inductive type) and a possibly empty list of the form (Type ... Type) for each type constructor, indicating the number of type parameters it takes;
  2. a corresponding list of value constructors for each inductive type.
The value constructors for an inductive type are themselves grouped in a list. Each element of this list is a pair of a constructor and its type. The type is given with the same syntax as any other (possibly functional) type. An optional :selector attribute for one of the arguments of the constructor can be used to name the corresponding selector.

Several constraints on the type definition ensure the generality and the well-foundedness of the type.

  • If a datatype D being defined has n parameters, each constructor of that datatype must return a value whose type has the form (D X₁ ... Xᵢ) where X₁ ... Xᵢ are i distinct type variables. Consequently, the constructor must have those type variables as (implicit or explicit) input parameters.
  • An inductive type can be parametrized by other types but not values. So it can be polymorphic but not dependent in general.
  • If a datatype D being defined has n parameters, each constructor of that datatype must return a value whose type has the form (D X₁ ... Xᵢ) where X₁ ... Xᵢ are i distinct type variables. Consequently, the constructor must have those type variables as (implicit or explicit) input parameters.
  • The constructors of the the same datatype must have distinct names. However, the same name can be used for two constructors of distinct inductive types.
  • An inductive type can be parametrized by other types but not values. So it can be polymorphic but not dependent in general. This is a temporary restriction that may be lifted at a later time.
  • No nested types are allowed. That is, the type of a constructor's argument for an inductive type D cannot be a type term that nests D within another type constructor. For instance, it is not possible to define a parametric inductive type (Tree A) with a constructor of type (→ A (List (Tree A)) (Tree A)) or (→ A (Array Int (Tree A)) (Tree A)) because (Tree A) occurs nested in List (resp. Array).
  • The same well-foundedness constraints as in 2.6 (which enable the type inhabited).

Note that specifying selectors is not mandatory anymore. The reason is that thanks to the match binder they can eliminated from any formulas without loss of generality.

Each inductive type D implicitly creates its own namespace similarly to modules. The fully qualified name for a constructor or selector f for a inductive type D is D::f. For instance, for the constructor the Pair type defined above id Pair::pair and the selectors are Pair::first and Pair::second. Short names can be used after opening the type, as with modules, with the open command:

  (open Pair)
The same rule on overloading apply as for modules.

Two new commands are provided as abbreviation of the define-inductive-types. The first one, define-inductive-type, can be used to define a single inductive type, as in:

  (define-inductive-type (List (Type)) 
    ( (nil (→ (! Type :var E) (List E))) 
      (cons (→ (! Type :var E :implicit) 
               (! E :sel head) (! (List E) :sel tail)) (List E))
    )
  )
The previous command is understood as an abbreviation of:
  (define-inductive-types ( (List (Type)) ) 
    ( ( (nil (→ (! Type :var E) (List E))) 
        (cons (→ (! Type :var E :implicit) 
                 (! E :sel head) (! (List E) :sel tail)) (List E))
      )
    )
  )
The second new command, provides a light-weight syntax for datatypes that have only constant constructors, as in:
  (define-enumeration-type Size (small medium large))
which abbreviates the more verbose
  (define-inductive-types ( (Size ()) )
    ( ((small Size) (medium Size) (large Size)) )
  )

Testers and recursors

Every inductive type definition induces an implicit definition of testers for each constructor as well as general recursor for the type. For example, for the List type introduced above there would be the following tester functions

  (List::is-nil (→ (! Type :var E :implicit) (List E) Bool))
  (List::is-cons (→ (! Type :var E :implicit) (List E) Bool))
as well as the recursor List::reduce defined as if it had been introduced as follows
  (define-fun-rec reduce 
    ( (A Type :implicit) (B Type :implicit)
      (b B) (comb (→ A (List A) B B))) (l (List A)) ) B
    (match l 
      ( (nil b)
        ((cons h t) (comb h t (reduce b comb t)))
      )
    ) 
  )

For example, the recursor List::reduce reduces any list l of type (List A) to a value of type B with the aid of a value b of type B for the case where l is empty, and of a function comb that computes the reduction in the non-empty case from the head h and tail t of l and the (recursively computed) reduction of t. A wealth of functions over lists can be defined in terms of List::reduce, including the following:
  (define-const sum ((l (List Int))) Int
    (List::reduce 0 (lambda ((n Int) (t (List Int)) (s Int)) (+ n s)) l) 
  )
  (define-fun len ((A Type :implicit) (l (List A))) Int
    (List::reduce 0 (lambda ((a A) (t (List A)) (n Int)) (+1 n)) l)
  )
  (define-fun append ((A Type :implicit) (l1 (List A))) (l2 (List A)) (List A)
    (List::reduce l2 (lambda ((a A) (t (List A)) (r (List A))) (cons a r)) l1)
  )
  (define-fun reverse ((A Type :implicit) (l (List A))) (List A)
    (List::reduce (nil A) 
      (lambda ((a A) (t (List A)) (r (List A))) (append r (cons a (nil A)))) l)
  )
  (define-fun filter ((A Type :implicit) (p (-> A Bool)) (l (List A))) (List A)
    (List::reduce (nil A) 
      (lambda ((a A) (t (List A)) (r (List A))) (ite (p a) (cons a r) r)) l)
  )
  (define-fun map ((A Type :implicit) (B Type :implicit) 
                   (f (→ A B)) (l (List A))) (List A)
    (List::reduce (nil A) 
      (lambda ((a A) (t (List A)) (r (List B))) (ite (p a) (cons a r) r)) l)
  )

External Symbols

When defining new theories there is sometimes the need for two theories to refer to each other's symbols. This need cannot be addressed with imports because it would create a circular dependency. To address this issue while keeping theories in separate modules the language allows the definition of external types and constants in a module.

More generally, one imports a module A in a module B if B is meant to be an extension of A. If, on the other hand, A and B are conceptually separate modules where, however, one module may need to use a type or a constant in the other, then they can be introduced as external. Types constructors and constants in module can be declared as external with an :extern attribute which connects them to symbols exported by another module.

; Example
(define-module Lists (
  (declare-datatypes ((List 1))
    ((nil (→ (! Type :var A) (List A)))
     (cons (→ (! Type :var A :implicit) 
              (! A :selector car) (! (List A) :selector cdr)) (List A))
    )
  )  
  ...
 (declare-type Int () :extern Ints::Int)
 (declare-const 0 Int :extern Ints::0)
 (declare-const +1 (→ Int Int) :extern Ints::+1)
 (define-fun len ((A Type :implicit) (l (List A))) Int
   (List::reduce 0 (lambda ((a A) (t (List A)) (n Int)) (+1 n)) l)
)
...
 )
 :types ((→ (! Type E) (List E)))
 :consts ((len (→ (Type :var E :implicit) (List E) Ints::Int)) 
          ...
         )
)

Using the :extern attribute, the type Int in the module Lists above is declared to be external and a proxy for Ints::Int. The constants 0 and +1 are introduced similarly. In general, symbols declared as :extern in a module M cannot be exported by M. This entails that if M exports a symbol s that depends on external symbols from a module N, a module importing M can use s in a command only if it imports N too. [To do: define precisely what it means for a symbol to depend on external ones.] In the example above, since Lists exports a function whose type and definition depend on external symbols from module Ints, any module importing Lists must also import Ints so that the reference to the external symbols in Lists can be resolved.

  ; Example 1
  (import (Lists))
  ; Since Lists::len depends on Ints, it cannot be used here
  ...
  ; Example 2
  (import (Ints Lists))
  ; Lists::len can be used here
  ...

Syntactic categories for :syntax restrictions

[To do: definition of <int_value> <pos_numeral> and so on via an attribute grammar. Syntax categories can be defined in modules and should be exported like types and constants.].

SMT-LIB 3 Theories

The standard SMT-LIB 3 theories are now defined just as modules. A first draft of the new definition for some of the standard theories in Version 2.6 is available below. Note that the modules are still at the pre-proposal stage. They are incomplete and subject to change. However, they should offer a pretty good idea of the expressive power of SMT-LIB 3. The theories are defined intentionally to be rather rich. Any restriction on theory signature and language is left to modules that define restricted logics.

In the current formalization, the theories below are essentially conservative extensions of the corresponding SMT-LIB 2.6 theories, that is, they have more symbols and richer types but do not change the semantics of the old ones in any meaningful way.

; Core SMT-LIB theory

(define-module Core ( 
  (set-info :smt-lib-version 3.0)
  (set-info :smt-lib-release "2020-XX-XX")
  (set-info :written-by "Pascal Fontaine and Cesare Tinelli")
  (set-info :date "2020-04-24")
  (set-info :last-updated "2020-04-24")
  (set-info :update-history
   "TODO.
   "
  )

  ;---------------
  ; Builtin types
  ;--------------- 
  ; Booleans
  (declare-type Bool () :builtin
   "The (binary) set {true, false}")

  ; Could be defined as a datatype in alternative

  ;-------------------
  ; Builtin constants
  ;------------------- 
   ; Equality  
  (declare-const = (-> (! Type :var A :implicit) 
                       A A Bool)
   :chainable 
   :builtin 
   "= denotes the identity function, which returns true iff 
    its two arguments are identical.")

  ;-----------------------
  ; Axiomatized constants
  ;----------------------- 
  ; true, false
  (define-const true Bool)
  (declare-const false Bool)
  ; ite
  (declare-const ite 
    (-> (! Type :var A :implicit) 
           Bool A A A))
  ; ite is if-then-else
  (assert (forall ((A Type)) (= (ite true) (lambda ((x A) (y A)) x))))
  (assert (forall ((A Type)) (= (ite false) (lambda ((x A) (y A)) y))))
  ; true and false are distinct
  (assert (= (= true false) false))

  ;-------------------
  ; Defined constants
  ;------------------- 
  ; not
  (define-const not (-> Bool Bool) 
    (lambda ((b Bool)) (ite b false true)))
  ; distinct
  (define-const distinct (-> (! Type :var A :implicit) A A Bool) 
    (lambda ((A Type :implicit) (x A) (y A)) (not (= x y))) :pairwise)
  ; or
  (define-const or (-> Bool Bool Bool) 
    (lambda ((b1 Bool) (b2 Bool)) (ite b1 true b2)) :left-assoc)
  ; and
  (define-const and (-> Bool Bool Bool) 
    (lambda ((b1 Bool) (b2 Bool)) (ite b1 b2 false)) :left-assoc)
  ; implies
  (define-const => (-> Bool Bool Bool) 
    (lambda ((b1 Bool) (b2 Bool)) (ite b1 b2 true)) :right-assoc)
  ; xor
  (define-const xor (-> Bool Bool Bool) 
    (lambda ((b1 Bool) (b2 Bool)) (distinct b1 b2)) :left-assoc)
  ; forall
  (define-const forall 
    (-> (! Type :var A :implicit)
        (-> A Bool) Bool)
    (lambda ((A Type :implicit) (P (-> A Bool)))
      (= P (lambda ((x A)) true))))
  ; Note: This forall quantifies only over individuals, not over types.
  ;       A forall quantifier over types is provided as a primitive binder.

  (set-info :notation 
   "(forall ((x τ)) φ) abbreviates 
    (forall (lambda ((x₁ τ₁)) φ))
  
    (forall ((x₁ τ₁) ⋅⋅⋅ (xᵢ τᵢ)) φ) abbreviates 
    (forall ((x₁ τ₁)) ⋅⋅⋅ (forall ((xᵢ τᵢ)) φ))

    (forall ((τ₁ Type) ⋅⋅⋅ (τⱼ Type) (x₁ τ₁) ⋅⋅⋅ (xᵢ τᵢ)) φ) abbreviates 
    (forall ((τ₁ Type) ⋅⋅⋅ (τⱼ Type))
      (forall ((x₁ τ₁)) ⋅⋅⋅ (forall ((xᵢ τᵢ)) φ)))
   ")
  ; exists
  (define-const exists 
    (-> (! Type :var A :implicit) 
        (-> A Bool) Bool) 
    (lambda ((A Type :implicit) (P (-> A Bool)))
      (distinct P (lambda ((x A)) false))))
  ;
  (set-info :notation 
   "(exists ((x τ)) φ) abbreviates 
    (exists (lambda ((x₁ τ₁)) φ))

    (exists ((x₁ τ₁) ⋅⋅⋅ (xᵢ τᵢ)) φ) abbreviates 
    (exists ((x₁ τ₁)) ⋅⋅⋅ (exists ((xᵢ τᵢ)) φ))
   ")

  ; Hilbert's choice (ε)
  (declare-const choose 
    (-> (! Type :var A :implicit)
        (-> A Bool) A))
  ;
  ; choose is axiomatically defined  
  (assert (forall ((A Type) (P (-> A Bool))) 
    (= (exists P) (P (choose P)))))
  ;
  (set-info :notation 
   "(choose (x τ) φ)  abbreviates  (choose (lambda ((x τ)) φ))
   ")

  ;-------------
  ; Bool values
  ;-------------
  ; The set of values of type Bool is {true, false}.
  (define-syntax (⟨bool_value⟩)
    ; production rules
    ((⟨bool_value⟩ (true false))))

  (define-values Bool ⟨bool_value⟩)

  ; compose
  (define-fun compose
   ((A Type :implicit)
    (B Type :implicit)
    (C Type :implicit)
    (f (-> B C))
    (g (-> A B))
   )
   (-> A C)
   (lambda ((a A)) (f (g a)))
   :note 
    "compose is not declared left-associative on purpose otherwise
     there would be ambiguity between (compose f g h) where f g and h 
     are functions to be composed and (compose f g a) where a is 
     the input of g."
 )

; ; pipe
;   (define-const pipe
;     (-> (! Type :var A :implicit)
;         (! Type :var B :implicit)
;         A (-> A B) B)
;     (lambda ((A Type) (B Type) (a A) (f (-> A B)))
;       (f a))
;     :left-assoc  ; this is actually problematic without type inference:
;     ; (pipe a f g) = (pipe (pipe a f) g)      ((pipe a f) g)
;     ; (pipe a f g) = (pipe (pipe a f) g) 
;     )


; identity
  (define-const iden
    (-> (! Type :var A :implicit) A A)
    (lambda ((A Type) (a A)) a))

  ;apply
  (define-const apply
    (-> (! Type :var A :implicit)
        (! Type :var B :implicit)
        (-> A B) A B)
    (lambda ((A Type) (B Type) (f (-> A B))) f)
    :left-assoc ; (apply g a b) = (apply (apply g a) b) 
  )

 

  ;-----------------------
  ; Abstract value syntax
  ;-----------------------
  (define-syntax 
    ; non-terminals
    ( ⟨abstract_value⟩ ⟨bool_value⟩ )
    ; production rules
    ( (⟨bool_value⟩ (true false))
      (⟨abstract_value⟩ ((! ⟨symbol⟩ :restrict "starts with @")))))

  (define-values () Bool ⟨bool_value⟩)


  ;-------------------
  ; Import Attributes
  ;-------------------

     ; FOL
  (set-info :FOL
   "The import attribute :FOL restricts Core
      to the first-order applicative fragment:
      - no lambdas
      - no partial applications: 
        the first argument of an application can only be a constant
      - all arguments of an application have a type of order 0
      - can only declare or define constants whose type has order <= 1
      - can use forall, exists, and choose only in their
        abbreviated form.
   ")

  (set-info :no-new-types 
   "The import attribute :no-new-types disallows the use  
    of the declare-type command.")

  (set-info :no-new-functions 
   "The import attribute :no-new-functions disallows the declaration 
    of constants of arrow type.")
))

 
(raw file)
(define-module Ints (
  (set-info :smt-lib-version 3.0)
  (set-info :smt-lib-release "2020-XX-XX")
  (set-info :written-by "Pascal Fontaine and Cesare Tinelli")
  (set-info :date "2020-05-01")
  (set-info :last-updated "2020-05-01")
  (set-info :update-history
   "TODO.
   "
  )
  (set-info :notes 
   "Integer arithmetic")
 
  (import (Core))
  (open Core)

  ;---------------
  ; Builtin types
  ;--------------- 
  ; Int
  (declare-type Int () :builtin 
  "Int denotes the set of all integer numbers.")

  ;-------------------
  ; Builtin constants
  ;------------------- 
  ; 0, 1, 2, ...
  (declare-const ⟨numeral⟩ Int :builtin 
   "Each numeral denotes the corresponding integer as usual.") 
  ; successor
  (declare-const +1 (-> Int Int) :builtin
   "+1 denotes the integer successor function")
  ; predecessor
  (declare-const -1 (-> Int Int) :builtin
   "-1 denotes the integer predecessor function")
  ; <
  (declare-const < (-> Int Int Bool) :builtin
   "< denotes the standard strict total ordering over the integers."
   :chainable)
  ;

  ;-----------------------------------
  ; Axiomatized and defined constants
  ;----------------------------------- 

  ; <=
  (define-const <= (-> Int Int Bool) (lambda ((x Int) (y Int)) 
    (or (= x y) (< x y)))
   :chainable)
  ; >
  (define-const > (-> Int Int Bool) (lambda ((x Int) (y Int))
    (< y x))
   :chainable)
  ; >=
  (define-const >= (-> Int Int Bool) (lambda ((x Int) (y Int))
    (or (= x y) (> x y)))
   :chainable)
  ; unary -
  (define-const-rec - (-> Int Int) (lambda ((x Int))
    (ite (= x 0) 0
      (ite (< x 0) (+1 (- (+1 x))) 
        (-1 (- (-1 x)))))))
  ; +
  (define-const-rec + (-> Int Int Int) (lambda ((x Int) (y Int)) 
    (ite (= y 0) x 
      (ite (< y 0) (-1 (+ x (+1 y))) 
        (+1 (+ x (-1 y))))))
   :left-assoc)
  ; binary -
  (define-const - (-> Int Int Int) (lambda ((x Int) (y Int)) 
    (+ x (- y)))
   :left-assoc
   :no-partial ; this constant cannot be applied partially 
   )
  ; * 
  (define-const-rec * (-> Int Int Int) (lambda ((x Int) (y Int)) 
    (ite (= y 0) 0
      (ite (< y 0) (- (* x (- y))) 
        (+ x (* x (-1 y))))))
   :left-assoc)
  ; abs
  (define-const abs (-> Int Int) (lambda ((x Int))
    (ite (>= x 0) x (- x))))
  ; div 
  (declare-const div (-> Int (! Int :var n :restrict (distinct n 0)) Int) 
   :left-assoc)
  ; mod
  (declare-const mod (-> Int Int Int))
  ; div/mod axiom
  (assert (forall ((m Int) (n Int)) (=> (distinct n 0)
    (let ((q (div m n)) (r (mod m n)))
      (and (= m (+ (* n q) r))
           (<= 0 r (-1 (abs n))))))))
      
  ; divisible by constant n
  (define-const divisible (-> Int (! Int :var n :restrict (> n 0)) Bool)
    (lambda ((m Int) (n Int)) (= (mod m n) 0)))


  (set-info :notes
   "The axiomatization of div and mod is based on [1].
   
    Regardless of the sign of m, 
    - when n is positive, (div m n) is the floor of the rational number m/n;
    - when n is negative, (div m n) is the ceiling of m/n.

    This contrasts with alternative but less robust definitions of div and mod
    where (div m n) is 
    - always the integer part of m/n (rounding towards 0), or 
    - always the floor of x/y (rounding towards -infinity).

    [1] Boute, Raymond T. (April 1992). 
        The Euclidean definition of the functions div and mod. 
        ACM Transactions on Programming Languages and Systems (TOPLAS) 
        ACM Press. 14 (2): 127 - 144. doi:10.1145/128861.128862.
   ")

  ; summation
  (define-const-rec sum (-> Int Int (-> Int Int) Int)
    (lambda ((a Int) (b Int) (f (-> Int Int)))
      (ite (> a b) 
        0 
        (+ (f a) (sum (+1 a) b f)))))

  ;-------- 
  ; Values
  ;--------
  ; The set of values for type Int consists of 
  ;  - all numerals,
  ;  - all terms of the form (- n) where n is a numeral other than 0.
  (define-syntax 
    ; non-terminals
    (⟨int_value⟩ ⟨pos_numeral⟩) 
    ; production rules
    ((⟨int_value⟩ (⟨numeral⟩
                   (- ⟨pos_numeral⟩)))
     (⟨pos_numeral⟩ ((! ⟨numeral⟩ :syntax-restrict "other than 0")))))

  (define-values () Int ⟨int_value⟩)


  ;-------------------------------
  ; External and Bridging symbols
  ;-------------------------------
  ; Real
  (declare-type Real () :extern Reals::Real)

  (declare-const to_int (-> Real Int) :builtin 
   "to_int denotes the function that maps each real number r 
    to its integer part, that is, to the largest integer n 
    that satisfies (<= (to_real n) r).")
))

 
(raw file)
(define-module Reals (
  (set-info :smt-lib-version 3.0)
  (set-info :smt-lib-release "2020-XX-XX")
  (set-info :written-by "Pascal Fontaine and Cesare Tinelli")
  (set-info :date "2020-07-01")
  (set-info :last-updated "2020-07-01")
  (set-info :update-history
   "TODO.
   "
  )
  (set-info :notes  "Real arithmetic")

  (import (Core))
  (open Core)

  ;---------------
  ; Builtin types
  ;---------------
  ; Real
  (declare-type Real () :builtin
   "Real denotes the set of all real numbers."
  )

  ;-------------------------------
  ; External and Bridging symbols
  ;-------------------------------
  ; Int
  (declare-type Int () :extern Ints::Int)

  (declare-const to_real (-> Int Real) :builtin 
   "to_real as the standard injection of the integers into the reals.")

  ;-------------------
  ; Builtin constants
  ;------------------- 
  ; Decimals
  (declare-const ⟨decimal⟩ Real :builtin
   "Each ⟨decimal⟩ denotes the corresponding real number as usual.")

  ; rational constants defined further down

   ; Unary -
  (declare-const - (-> Real Real) :builtin "Negation.")
  ;  
  ; +
  (declare-const + (-> Real Real Real) :left-assoc :builtin "Addition.")

  ; *
  (declare-const * (-> Real Real Real) :left-assoc :builtin "Multiplication.")

  ; <=
  (declare-const <= (-> Real Real Bool) :chainable :builtin 
   "Usual non-strict total ordering over the reals.")

   ; is_int
  (declare-const is_int (-> Real Bool) :builtin
   "is_int maps whole real numbers to true and all the other reals to false.")

  ;-----------------------
  ; Axiomatized constants
  ;----------------------- 

  ; subtraction -
  (define-const - (-> Real Real Real) (lambda ((x Real) (y Real))
    (+ x (- y))
   :left-assoc))

  ; division /
  (declare-const / (-> Real Real Real) :left-assoc :restrict (distinct y 0.0))
  (assert (forall ((x Real) (y Real)) (=> (distinct y 0.0)
    (= x (* y (/ x y))))))
    
  ; <
  (define-const < (-> Real Real Bool) (lambda ((x Real) (y Real))
    (and (<= x y) (distinct x y)))
   :chainable)

  ; >
  (define-const > (-> Real Real Bool) (lambda ((x Real) (y Real))
    (< y x))
   :chainable)

  ; >=
  (define-const >= (-> Real Real Bool) (lambda ((x Real) (y Real))
    (<= y x))
   :chainable)

  ; Summation 
  ; we use a step different from 1 and provide it as additional input
  (define-const-rec sum (-> Real Real (-> Real Real) Real)
    (lambda ((a Real) (b Real) (f (-> Real Real)))
      (ite (> a b) 
        0.0 
        (+ (f a) (sum (+ a 1.0) b f)))))

  ; rational constants
  (define-const / 
    (-> (! Int :value) 
        (! Int :var d :value :restrict (> (to_real d) 0.0))
        Real) 
    (lambda ((n Int) (d Int)) 
      (/ (to_real n) (to_real d))))


  ;-------------
  ; Real values
  ;-------------

  ; The set of values for the sort Real consists of 
  ;  - an abstract value for each irrational algebraic number
  ;  - all decimals of the form n.0
  ;  - all terms of the form (- n.0) where n is a ⟨numeral⟩ other than 0
  ;  - all terms of the form (/ m n) or (- (/ m n)) where 
  ;    - m is a ⟨numeral⟩ other than 0,
  ;    - n is a ⟨numeral⟩ other than 0 and 1,
  ;    - as integers, m and n have no common factors besides 1.
  (define-syntax 
    ; non-terminals
    (⟨real_value⟩ ⟨p⟩ ⟨f⟩ ⟨m⟩ ⟨n⟩) 
    ; production rules
    ((⟨real_value⟩ (⟨abstract⟩ ; for irrational algebraic numbers
                    0.0
                    ⟨p⟩
                    (- ⟨p⟩)
                    ⟨f⟩
                    (- ⟨f⟩))
     (⟨f⟩ ((! (/ ⟨m⟩ ⟨n⟩) :syntax-restrict 
                          "as integers, ⟨m⟩ and ⟨n⟩ have no common factors besides 1")))
     (⟨p⟩ ((! ⟨decimal⟩ :syntax-restrict "it has the form n.0 except for 0.0")))
     (⟨m⟩ ((! ⟨numeral⟩ :syntax-restrict "it is not 0")))
     (⟨n⟩ ((! ⟨numeral⟩ :syntax-restrict "it is neither 0 nor 1"))))))

  (define-values () Real ⟨real_value⟩)

  (set-info :notes
   "In Real values decimals are used only for whole numbers, e.g., 42.0. 
   For all other rational numbers the fractional notation is necessary, e.g., (/ 1 2).
   This way there is unique value for each rational."
  )

  (set-info :notes
   "The restriction of Reals over the signature having just the symbols 
    (0.0 Real)
    (1.0 Real)
    (- Real Real)
    (+ Real Real Real) 
    (* Real Real Real)
    (<= Real Real Bool)
    (<  Real Real Bool)
    coincides with the theory of real closed fields, axiomatized by
    the formulas below: 

    - associativity of +
      (forall ((x Real) (y Real) (z Real)) (= (+ (+ x y) z) (+ x (+ y z))))

    - commutativity of +
      (forall ((x Real) (y Real)) (= (* x y) (* y x)))

    - 0.0 is the right (and by commutativity, left) unit of +
      (forall ((x Real)) (= (+ x 0.0) x))

    - right (and left) inverse wrt +
      (forall ((x Real)) (= (+ x (- x)) 0.0))

    - associativity of *
      (forall ((x Real) (y Real) (z Real)) (= (* (* x y) z) (* x (* y z))))

    - commutativity of *
      (forall ((x Real) (y Real)) (= (* x y) (* y x)))

    - 1.0 is the right (and by commutativity, left) unit of *
      (forall ((x Real)) (= (* x 1.0) x))

    - existence of right (and left) inverse wrt *
      (forall ((x Real)) (or (= x 0) (exists (y Real) (= (* x y) 1.0))))

    - left distributivity of * over +
      (forall ((x Real) (y Real) (z Real)) (= (* x (+ y z)) (+ (* x y) (* x z))))

    - right distributivity of * over +
     (forall ((x Real) (y Real) (z Real)) (= (* (+ x y) z) (+ (* x z) (* y z))))

    - non-triviality
      (distinct 0.0 1.0)

    - all positive elements have a square root
      (forall ((x Real)) (exists ((y Real))
        (or (= x (* y y)) (= (- x) (* y y)))))

    - axiom schemas for all n > 0
      (forall ((x_1 Real) ... (x_n Real))
        (distinct (- 1) (+ (* x_1 x_1) (+ ... (* x_n x_n))))) 

    - axiom schemas for all odd n > 0 where (^ y n) abbreviates
      the n-fold product of y with itself
      (forall ((x_1 Real) ... (x_n Real)) (exists ((y Real))
        (= 0 (+ (^ y n) (+ (* x_1 (^ y n-1)) (+  ... (+ (* x_{n-1} y) x_n)))))))

    - reflexivity of <=
      (forall ((x Real)) (<= x x))

    - antisymmetry of <=
      (forall ((x Real) (y Real)) (=> (and (<= x y) (<= y x)) (= x y)))

    - transitivity of <=
      (forall ((x Real) (y Real) (z Real)) (=> (<= x y z)) (<= x z))

    - totality of <=
      (forall ((x Real) (y Real)) (or (<= x y) (<= y x)))
    
    - monotonicity of <= wrt +
      (forall ((x Real) (y Real) (z Real)) (=> (<= x y) (<= (+ x z) (+ y z))))

    - monotonicity of <= wrt *
      (forall (x Real) (y Real) (z Real) 
        (=> (and (<= x y) (<= 0 z)) (<= (* z x) (* z y))))

    - definition of <
      (forall ((x Real) (y Real)) 
        (= (< x y) (and (<= x y) (distinct x y))))

    References:
    1) W. Hodges. Model theory. Cambridge University Press, 1993.
    2) PlanetMath, http://planetmath.org/encyclopedia/RealClosedFields.html
  ")

))
 
(raw file)
;; **Make sure that all operators in QF_BV.smt2 have been added**

(define-module FixedSizeBitVectors (
  (set-info :smt-lib-version 3.0)
  (set-info :smt-lib-release "2020-XX-XX")
  (set-info :written-by "Pascal Fontaine and Cesare Tinelli")
  (set-info :date "2020-05-01")
  (set-info :last-updated "2020-07-01")
  (set-info :update-history
   "TODO.
   "
    )
  (set-info :notes
   "This theory declaration defines a core theory for fixed-size bitvectors 
    where the operations of concatenation and extraction of bitvectors as well 
    as the usual logical and arithmetic operations are overloaded.
   ")
 
   (import (Core))
   (open Core)

  ;------------------
  ; External symbols
  ;------------------

  (declare-type Int () :extern Ints::Int)

  (declare-const ⟨numeral⟩ Int :extern Ints::⟨numeral⟩)

  (declare-const 0 Int :extern Ints::0)
  (declare-const 1 Int :extern Ints::1)
  (declare-const +1 (-> Int Int) :extern Ints::+1)
  (declare-const + (-> Int Int Int) :extern Ints::+)
  (declare-const - (-> Int Int Int) :extern Ints::-)
  (declare-const * (-> Int Int Int) :extern Ints::*)
  (declare-const < (-> Int Int Bool) :extern Ints::<)
  (declare-const > (-> Int Int Bool) :extern Ints::>)

  ;---------------
  ; Builtin types
  ;---------------

  ; definition as a dependent type 
  (declare-type BitVec ((! Int :var m :restrict (<= 0 m)))
   :builtin 
   "(BitVec m) denotes the set of all bitvectors of size m >= 0.
   More precisely, it denotes the set of finite functions
   whose domain is the initial segment [0, m) of the naturals, starting at
   0 (included) and ending at m (excluded), and whose co-domain is {0, 1}.")

  (set-info :notes
   "(BitVec 0) contains just one element, the empty vector, corresponding to
    the only function in the function space {} -> {0,1}, the empty function.
   "
  )
				   
  (set-info :notes
   "To define some of the semantics below, we use the following mathematical
    functions:

    o _ div _,  which takes an integer x ≥ 0 and an integer y > 0 and returns
     the integer part of x divided by y (i.e., truncated integer division).

    o _ rem _, which takes an integer x ≥ 0 and y > 0 and returns the
      remainder when x is divided by y.  Note that we always have the following
      equivalence for y > 0: (x div y) * y + (x rem y) = x.

    o bv2nat, which takes a bitvector b: [0, m) → {0, 1}
      with 0 < m, and returns an integer in the range [0, 2ᵐ),
      and is defined as follows:

        bv2nat(b) := b(m-1) * 2ᵐ⁻¹ + b(m-2) * 2ᵐ⁻² + ⋯ + b(0) * 2⁰

      Note that bv2nat(b) = 0 for the empty bitvector b: [0, 0) → {0, 1}.

    o nat2bv[m], with 0 <= m, which takes a non-negative integer
      n and returns the (unique) bitvector b: [0, m) → {0, 1}
      such that

        b(m-1) * 2ᵐ⁻¹ + ⋯ + b(0) * 2⁰ = n rem 2ᵐ

      Note that nat2bv[0](n) is the empty vector b: [0, 0) → {0, 1}
      for every natural n.

    The semantic interpretation of terms of type (BitVec m) is provided
    by the function ⟦_⟧ used in the definition below for the builtin operators.
   ")

  (set-info :notes
   "In builtin definitions below, |exp| is used denote the integer  
    resulting from the evaluation of the arithmetic expression exp.
   ")

  ;--------------------
  ; Bridging functions
  ;--------------------

  ; to_int
  (declare-const to_int
    (-> (! Int :var m :implicit :restrict (>= m 0)) 
        (BitVec m) Int)
   :built-in 
    "Standard (BitVec m) to Int conversion, mapping each vector 
     to a number in the interval [0,2^m)"
  )

  ; to_bv from Int
  (declare-const to_bv 
    (-> (! Int :var m :restrict (<= 0 m))  ; explicit argument
        Int (BitVec m))
   :built-in 
   "Int to (BitVec m) conversion mapping each integer n 
    to the 2's complement representation of (n modulo m)." ; to be double-checked
  )

  ; to_bv from Bool
  (define-const to_bv (-> Bool (BitVec 1))
    (lambda ((b Bool)) (ite b #b1 #b0)))

  ;-------------------
  ; Constant literals
  ;-------------------

  ; Empty bitvector
  (declare-const bvempty (BitVec 0))

  ; Bitvector literals in binary form (such as #b10110)
  (declare-const ⟨binary⟩
    (-> (! Int :var m :syntax ⟨numeral⟩ :implicit 
         :restrict "m is the number of binary digits in the constant") 
        (BitVec m))
   :builtin
   "The constant symbols #b0 and #b1 of sort (BitVec 1) are defined as follows

    ⟦#b0⟧ := λx:[0, 1). 0
    ⟦#b1⟧ := λx:[0, 1). 1

    More generally, given a string #b followed by a sequence of 0's and 1's,
    if n is the numeral represented in base 2 by the sequence of 0's and 1's
    and m is the length of the sequence, then the term represents
    nat2bv[m](n).
   ")

  ; Bitvector literals in hexadecimal form (such as #xB93AD)
  (declare-const ⟨hexadecimal⟩
    (-> (! Int :var m :syntax ⟨numeral⟩ :implicit
         :restrict "m is 4 times the number of hexadecimal digits in the constant") 
        (BitVec m))
   :builtin
   "Let h be the constant. Then h denotes nat2bv[m](n) where n is the numeral 
    represented in hexadecimal (base 16) by h.
    For example, #xFF is equivalent to #b11111111.
   ")

				     
  ;--------------------
  ; Sequence operators
  ;--------------------

  ; concat
  (declare-const concat 
    (-> (! Int :var i :syntax ⟨numeral⟩ :implicit :restrict (<= 0 i)) 
        (! Int :var j :syntax ⟨numeral⟩ :implicit :restrict (<= 0 j)) 
        (! Int :var k :syntax ⟨numeral⟩ :implicit :restrict (= k (+ i j))) 
        (BitVec i) (BitVec j)
        (BitVec k))
   :builtin
   "- For all terms s and t of type (BitVec 0) and (BitVec j), respectively,
      ⟦(concat s t)⟧ := ⟦t⟧
    - For all terms s and t of type (BitVec i) and (BitVec j) with i > 0,
      ⟦(concat s t)⟧ := λx:[0, i+j). (if (x < j) then ⟦t⟧(x) else ⟦s⟧(x - j))
 ")

  ; extract
  (declare-const extract
    (-> (! Int :var m :syntax ⟨numeral⟩ :implicit :restrict (< 0 m)) 
        (! Int :var i :syntax ⟨numeral⟩ :restrict (< i m)) 
        (! Int :var j :syntax ⟨numeral⟩ :restrict (<= 0 j i))
        (! Int :var k :syntax ⟨numeral⟩ :implicit :restrict (= k (+1 (- i j))))
        (BitVec m) (BitVec k))
   :builtin
   "For all terms t of type (BitVec m),
    ⟦(extract i j t)⟧ := λx:[0, m). ⟦t⟧(j + x)
   ")

  (set-info :notes
   "extract extracts from a _non-empty_ bitvector a sub-bitvector 
    from position i down to position j.
    The spec could be refined to have the function return the empty vector 
    when i is smaller than j.
   ")

  ; repeat
  ; (repeat i x) means concatenate i copies of x
  (define-fun-rec repeat 
    ((i Int :syntax ⟨numeral⟩ :restrict (<= 0 i)) 
     (m Int :syntax ⟨numeral⟩ :implicit :restrict (<= 0 m)) 
     (n Int :syntax ⟨numeral⟩ :implicit :restrict (= n (* m i))) 
     (x (BitVec m))
    ) 
    (BitVec n)
    (ite (= i 0) bvempty (concat x (repeat (- i 1) x)))
  )

  ;-------------------- 
  ; Bit-wise operators
  ;-------------------- 

  ; bvnot
  (declare-const bvnot
    (-> (! Int :var m :syntax ⟨numeral⟩ :implicit :restrict (<= 0 m)) 
        (BitVec m) (BitVec m))
   :builtin
   "- For all terms s of type (BitVec 0),
      ⟦(bvnot s)⟧ := ⟦bvempty⟧
    - For all terms s of type (BitVec m) with m > 0,
      ⟦(bvnot s)⟧ := λx:[0, m). if ⟦s⟧(x) = 0 then 1 else 0
   ")

  ; bvand
  (declare-const bvand
    (-> (! Int :var m :syntax ⟨numeral⟩ :implicit :restrict (<= 0 m)) 
        (BitVec m) (BitVec m) (BitVec m))
   :left-assoc
   :builtin
   "- For all terms s, t of type (BitVec 0),
      ⟦(bvand s t)⟧ := ⟦bvempty⟧
    - For all terms s, t of type (BitVec m) with m > 0,
      ⟦(bvand s t)⟧ := λx:[0, m). if ⟦s⟧(x) = 0 then 0 else ⟦t⟧(x)
   ")

  ; bvor
  (declare-const bvor
    (-> (! Int :var m :syntax ⟨numeral⟩ :implicit :restrict (<= 0 m)) 
        (BitVec m) (BitVec m) (BitVec m)) 
   :left-assoc
   :builtin
   "- For all terms s, t of type (BitVec 0),
      ⟦(bvor s t)⟧ := ⟦bvempty⟧
    - For all terms s, t of type (BitVec m) with m > 0,
      ⟦(bvor s t)⟧ := λx:[0, m). if ⟦s⟧(x) = 1 then 1 else ⟦t⟧(x)
   ")

  ; bvnand
  (define-const bvnand
    (-> (! Int :var m :syntax ⟨numeral⟩ :implicit :restrict (<= 0 m)) 
         (BitVec m) (BitVec m) (BitVec m)) 
    (lambda ((m Int :syntax ⟨numeral⟩ :implicit :restrict (<= 0 m))
             (s (BitVec m)) (t (BitVec m))) 
      (bvnot (bvand s t)))
  )

  ; bvnor
  (define-const bvnor
    (-> (! Int :var m :syntax ⟨numeral⟩ :implicit :restrict (<= 0 m)) 
        (BitVec m) (BitVec m) (BitVec m)) 
    (lambda ((m Int :syntax ⟨numeral⟩ :implicit :restrict (<= 0 m)) 
             (s (BitVec m)) (t (BitVec m))) 
      (bvnot (bvor s t)))
  )

  ; bvxor
  (define-const bvxor
    (-> (! Int :var m :syntax ⟨numeral⟩ :implicit :restrict (<= 0 m)) 
        (BitVec m) (BitVec m) (BitVec m)) 
    (lambda ((m Int :syntax ⟨numeral⟩ :implicit :restrict (<= 0 m)) 
             (s (BitVec m)) (t (BitVec m))) 
      (bvor (bvand s (bvnot t)) (bvand (bvnot s) t)))
   :left-assoc
  )

  ; bvxnor
  (define-const bvxnor
    (-> (! Int :var m :syntax ⟨numeral⟩ :implicit :restrict (<= 0 m)) 
        (BitVec m) (BitVec m) (BitVec m)) 
    (lambda ((m Int :syntax ⟨numeral⟩ :implicit :restrict (<= 0 m)) 
             (s (BitVec m)) (t (BitVec m))) 
      (bvor (bvand s t) (bvand (bvnot s) (bvnot t))))
  )

  ;bvcomp
  (declare-const bvcomp
    (-> (! Int :var m :syntax ⟨numeral⟩ :implicit :restrict (<= 0 m)) 
        (BitVec m) (BitVec m) (BitVec 1))
    (lambda ((m Int :var m :syntax ⟨numeral⟩ :implicit :restrict (<= 0 m))
             (s (BitVec m)) (t (BitVec m)))
      (to_bv (= s t)))
  )
;   :builtin
;   "For all terms s, t of type (BitVec m),
;    (bvcomp s t) is equivalent to 
;     - (bvxnor s t) 
;       if m = 1
;     - (bvand (bvxnor ((extract |m-1| |m-1|) s) ((extract |m-1| |m-1|) t))
;         (bvcomp ((extract |m-2| 0) s) ((extract |m-2| 0) t))) 
;       otherwise
;   ")

  ;----------------------
  ; Arithmetic operators
  ;----------------------

  ; bvneg
  (declare-const bvneg
    (-> (! Int :var m :syntax ⟨numeral⟩ :implicit :restrict (<= 0 m)) 
        (BitVec m) (BitVec m))
   :builtin
   "- For all terms s of type (BitVec 0),
      ⟦(bvneg s)⟧ := ⟦bvempty⟧
    - For all terms s of type (BitVec m),
      ⟦(bvneg s)⟧ := nat2bv[m](2ᵐ - bv2nat(⟦s⟧))
   ")

  ; bvadd
  (declare-const bvadd
    (-> (! Int :var m :syntax ⟨numeral⟩ :implicit :restrict (<= 0 m)) 
        (BitVec m) (BitVec m) (BitVec m)) 
   :left-assoc
   :builtin
   "- For all terms s, t of type (BitVec 0),
      ⟦(bvadd s t)⟧ := ⟦bvempty⟧
    - For all terms s, t of type (BitVec m),
      ⟦(bvadd s t)⟧ := nat2bv[m](bv2nat(⟦s⟧) + bv2nat(⟦t⟧))
   ")

  ; bvmul
  (declare-const bvmul
    (-> (! Int :var m :syntax ⟨numeral⟩ :implicit :restrict (<= 0 m)) 
        (BitVec m) (BitVec m) (BitVec m)) 
   :left-assoc
   :builtin
   "- For all terms s, t of type (BitVec 0),
      ⟦(bvmul s t)⟧ := ⟦bvempty⟧
    - For all terms s, t of type (BitVec m),
      ⟦(bvmul s t)⟧ := nat2bv[m](bv2nat(⟦s⟧) * bv2nat(⟦t⟧))
   ")

  ; bvudiv
  (declare-const bvudiv
    (-> (! Int :var m :syntax ⟨numeral⟩ :implicit :restrict (<= 0 m)) 
        (BitVec m) (BitVec m) (BitVec m)) 
   :builtin
   "- For all terms s, t of type (BitVec 0),
      ⟦(bvudiv s t)⟧ := ⟦bvempty⟧
    - For all terms s, t of type (BitVec m),
      ⟦(bvudiv s t)⟧ := if bv2nat(⟦t⟧) = 0
                        then λx:[0, m). 1
                        else nat2bv[m](bv2nat(⟦s⟧) div bv2nat(⟦t⟧))
   ")

  ; bvurem
  (declare-const bvurem
    (-> (! Int :var m :syntax ⟨numeral⟩ :implicit :restrict (<= 0 m)) 
        (BitVec m) (BitVec m) (BitVec m)) 
   :builtin
   "- For all terms s, t of type (BitVec 0),
      ⟦(bvurem s t)⟧ := ⟦bvempty⟧
    - For all terms s, t of type (BitVec m),
      ⟦(bvurem s t)⟧ := if bv2nat(⟦t⟧) = 0 
                       then ⟦s⟧
                       else nat2bv[m](bv2nat(⟦s⟧) rem bv2nat(⟦t⟧))
   ")

  ; bvsub
  (define-const bvsub
    (-> (! Int :var m :syntax ⟨numeral⟩ :implicit :restrict (<= 0 m)) 
        (BitVec m) (BitVec m) (BitVec m))
    (lambda ((m Int :syntax ⟨numeral⟩ :implicit :restrict (<= 0 m)) 
             (s (BitVec m)) (t (BitVec m))) 
      (bvadd s (bvneg t)))
   )

  ;----------------------------- 
  ; Signed arithmetic operators
  ;----------------------------- 

  ; bvsdiv
  (declare-const bvsdiv
      (-> (! Int :var m :syntax ⟨numeral⟩ :implicit :restrict (<= 0 m)) 
          (BitVec m) (BitVec m) (BitVec m)) 
   :builtin
   "- For all terms s, t of type (BitVec 0),
      ⟦(bvsdiv s t)⟧ := ⟦bvempty⟧
    - For all terms s, t of type (BitVec m),
      (bvsdiv s t) is equivalent to 
        (let ((?msb_s ((extract |m-1| |m-1|) s))
              (?msb_t ((extract |m-1| |m-1|) t)))
         (ite (and (= ?msb_s #b0) (= ?msb_t #b0))
              (bvudiv s t)
         (ite (and (= ?msb_s #b1) (= ?msb_t #b0))
              (bvneg (bvudiv (bvneg s) t))
         (ite (and (= ?msb_s #b0) (= ?msb_t #b1))
              (bvneg (bvudiv s (bvneg t)))
              (bvudiv (bvneg s) (bvneg t))))))
   ")

  ; bvsrem
  (declare-const bvsrem
      (-> (! Int :var m :syntax ⟨numeral⟩ :implicit :restrict (<= 0 m)) 
          (BitVec m) (BitVec m) (BitVec m)) 
   :builtin
   "- For all terms s, t of type (BitVec 0),
      ⟦(bvsrem s t)⟧ := ⟦bvempty⟧
    - For all terms s, t of type (BitVec m),
      (bvsrem s t) is equivalent to  
       (let ((?msb_s ((extract |m-1| |m-1|) s))
             (?msb_t ((extract |m-1| |m-1|) t)))
         (ite (and (= ?msb_s #b0) (= ?msb_t #b0))
              (bvurem s t)
         (ite (and (= ?msb_s #b1) (= ?msb_t #b0))
              (bvneg (bvurem (bvneg s) t))
         (ite (and (= ?msb_s #b0) (= ?msb_t #b1))
              (bvurem s (bvneg t)))
              (bvneg (bvurem (bvneg s) (bvneg t))))))
   ")

;;;; REVISE DEFINITION [what is (bv0 m) ?]  TODO
  ; bvsmod
  (declare-const bvsmod
    (-> (! Int :var m :syntax ⟨numeral⟩ :implicit :restrict (<= 0 m)) 
        (BitVec m) (BitVec m) (BitVec m)) 
   :builtin
   "- For all terms s, t of type (BitVec 0),
      ⟦(bvsmod s t)⟧ := ⟦bvempty⟧
    - For all terms s, t of type (BitVec m),
      (bvsmod s t) is equivalent to
       (let ((?msb_s ((extract |m-1| |m-1|) s))
             (?msb_t ((extract |m-1| |m-1|) t)))
         (let ((abs_s (ite (= ?msb_s #b0) s (bvneg s)))
               (abs_t (ite (= ?msb_t #b0) t (bvneg t))))
            (let ((u (bvurem abs_s abs_t)))
             (ite (= u (bv0 m))
                  u
             (ite (and (= ?msb_s #b0) (= ?msb_t #b0))
                  u
             (ite (and (= ?msb_s #b1) (= ?msb_t #b0))
                  (bvadd (bvneg u) t)
             (ite (and (= ?msb_s #b0) (= ?msb_t #b1))
                   (bvadd u t)
                  (bvneg u))))))))
   ")

  ;-------------------------------
  ; Unsigned Comparison operators
  ;-------------------------------

  ; bvult
  (declare-const bvult
    (-> (! Int :var m :syntax ⟨numeral⟩ :implicit :restrict (<= 0 m)) 
        (BitVec m) (BitVec m) Bool) 
   :chainable
   :builtin
   "- For all terms s, t of type (BitVec 0),
      ⟦(bvult s t)⟧ := false
    - For all terms s, t of type (BitVec m),
      ⟦(bvult s t)⟧ := true iff bv2nat(⟦s⟧) < bv2nat(⟦t⟧)
   ")
 
  ; bvule
  (define-const bvule
    (-> (! Int :var m :syntax ⟨numeral⟩ :implicit :restrict (<= 0 m)) 
        (BitVec m) (BitVec m) Bool) 
    (lambda ((m Int :implicit :restrict (<= 0 m))
	     (s (BitVec m)) (t (BitVec m))) 
      (or (bvult s t) (= s t)))
   )

  ; bvugt
  (define-const bvugt
    (-> (! Int :var m :syntax ⟨numeral⟩ :implicit :restrict (<= 0 m)) 
        (BitVec m) (BitVec m) Bool) 
    (lambda ((m Int :syntax ⟨numeral⟩ :implicit :restrict (<= 0 m))
             (s (BitVec m)) (t (BitVec m))) 
      (bvult t s))
   )

  ; bvuge
  (define-const bvuge
    (-> (! Int :var m :syntax ⟨numeral⟩ :implicit :restrict (<= 0 m)) 
        (BitVec m) (BitVec m) Bool) 
    (lambda ((m Int :syntax ⟨numeral⟩ :implicit :restrict (<= 0 m))
             (s (BitVec m)) (t (BitVec m))) 
      (bvule t s))
   )

  ;-----------------------------
  ; Signed Comparison operators
  ;-----------------------------

  ; bvslt
  (declare-const bvslt
    (-> (! Int :var m :syntax ⟨numeral⟩ :implicit :restrict (<= 0 m)) 
        (BitVec m) (BitVec m) Bool)
   :builtin
   "- For all terms s, t of type (BitVec 0),
      ⟦(bvslt s t)⟧ := false
    - For all terms s, t of type (BitVec m),
      (bvslt s t) is equivalent to  
       (or (and (= ((extract |m-1| |m-1|) s) #b1)
                (= ((extract |m-1| |m-1|) t) #b0))
           (and (= ((extract |m-1| |m-1|) s) ((extract |m-1| |m-1|) t))
                (bvult s t)))
   ")

  ; bvsle
  (declare-const bvsle
    (-> (! Int :var m :syntax ⟨numeral⟩ :implicit :restrict (<= 0 m)) 
        (BitVec m) (BitVec m) Bool)
   :builtin
   "- For all terms s, t of type (BitVec 0),
      ⟦(bvsle s t)⟧ := false
    - For all terms s, t of type (BitVec m),
      (bvsle s t) is equivalent to 
       (or (and (= ((extract |m-1| |m-1|) s) #b1)
                (= ((extract |m-1| |m-1|) t) #b0))
           (and (= ((extract |m-1| |m-1|) s) ((extract |m-1| |m-1|) t))
                (bvule s t)))
   ")

  ; bvsgt
  (define-const bvsgt
    (-> (! Int :var m :syntax ⟨numeral⟩ :implicit :restrict (<= 0 m)) 
        (BitVec m) (BitVec m) Bool) 
    (lambda ((m Int :syntax ⟨numeral⟩ :implicit :restrict (<= 0 m))
	     (s (BitVec m)) (t (BitVec m))) 
      (bvslt t s))
   )
  
  ; bvsge
  (define-const bvsge
    (-> (! Int :var m :syntax ⟨numeral⟩ :implicit :restrict (<= 0 m)) 
        (BitVec m) (BitVec m) Bool) 
    (lambda ((m Int :syntax ⟨numeral⟩ :implicit :restrict (<= 0 m))
             (s (BitVec m)) (t (BitVec m))) 
      (bvsle t s))
   )   

  ;----------------- 
  ; Shift operators
  ;----------------- 

  ; bvshl
  (declare-const bvshl
    (-> (! Int :var m :syntax ⟨numeral⟩ :implicit :restrict (<= 0 m)) 
        (! Int :var n :syntax ⟨numeral⟩ :implicit :restrict (<= 0 n)) 
        (BitVec m) (BitVec n) (BitVec m))
   :builtin
   "For all terms s and t of type (BitVec m) and (BitVec n), respectively,
    ⟦(bvshl s t)⟧ := nat2bv[m](bv2nat(⟦s⟧) * 2ᵏ)  where k = bv2nat(⟦t⟧)
   ")
 
  ; bvlshr  
  (declare-const bvlshr
    (-> (! Int :var m :syntax ⟨numeral⟩ :implicit :restrict (<= 0 m)) 
        (! Int :var n :syntax ⟨numeral⟩ :implicit :restrict (<= 0 n)) 
    (BitVec m) (BitVec n) (BitVec m))
   :builtin
   "For all terms s and t of type (BitVec m) and (BitVec n), respectively,
    ⟦(bvlshr s t)⟧ := nat2bv[m](bv2nat(⟦s⟧) div 2ᵏ) where k = bv2nat(⟦t⟧)
   ")

  ;------------------------- 
  ; Miscellaneous operators
  ;------------------------- 

  ; zero_extend
  (declare-const zero_extend
    (-> (! Int :var i :syntax ⟨numeral⟩ :restrict (<= 0 i)) 
        (! Int :var m :syntax ⟨numeral⟩ :implicit :restrict (<= 0 m)) 
        (! Int :var n :syntax ⟨numeral⟩ :implicit :restrict (= n (+ m i))) 
        (BitVec m) (BitVec n))
   :builtin
   "For all terms t of type (BitVec m),
    (zero_extend i t) means extend t with zeroes to the (unsigned) bitvector
    of size m+i that has the same (unsigned) integer value.
   ")

  ; sign_extend
  (declare-const sign_extend
    (-> (! Int :var i :syntax ⟨numeral⟩ :restrict (<= 0 i)) 
        (! Int :var m :syntax ⟨numeral⟩ :implicit :restrict (<= 0 m)) 
        (! Int :var n :syntax ⟨numeral⟩ :implicit :restrict (= n (+ m i))) 
        (BitVec m) (BitVec n))
   :builtin
   "For all terms t of type (BitVec m),
    (sign_extend i t) means extend to the (signed) bitvector of size m+i
    that has the same (signed) integer value as t.
   ")

  ; rotate_left
  (declare-const rotate_left
    (-> (! Int :var i :syntax ⟨numeral⟩ :restrict (<= 0 i)) 
        (! Int :var m :syntax ⟨numeral⟩ :implicit :restrict (<= 0 m)) 
        (BitVec m) (BitVec m))
   :builtin
   "For all terms t of type (BitVec m),
    (rotate_left i t) means rotate the bits of t to the left i times.
   ")
 
  ; rotate_right
  (declare-const rotate_right
    (-> (! Int :var i :syntax ⟨numeral⟩ :restrict (<= 0 i)) 
        (! Int :var m :syntax ⟨numeral⟩ :implicit :restrict (<= 0 m)) 
        (BitVec m) (BitVec m))
   :builtin
   "For all terms t of type (BitVec m),
    (rotate_right i t) means rotate the bits of t to the right i times.
   ")
 
  ; pop_count
  (declare-const pop_count
    (-> (! Int :var m :syntax ⟨numeral⟩ :implicit :restrict (<= 0 m)) 
        (BitVec m) (BitVec m))
  :builtin
  "For all terms t of type (BitVec m),
   (pop_count t) is the bitvector whose unsigned integer value is equal 
   to the number of 1s in t. 
  ")
   
  ; reduce_and
  (declare-const reduce_and
    (-> (! Int :var m :syntax ⟨numeral⟩ :implicit :restrict (<= 0 m)) 
        (BitVec m) (BitVec 1))
  :builtin
  "For all terms t of type (BitVec m),
   (reduce_and t) returns the logical conjunction of all the bits in t.
   Note that the result is #b1 if m is 0. 
  ")

  ; reduce_or
  (declare-const reduce_or
    (-> (! Int :var m :syntax ⟨numeral⟩ :implicit :restrict (<= 0 m)) 
        (BitVec m) (BitVec 1))
  :builtin
  "For all terms t of type (BitVec m),
   (reduce_or t) returns the logical disjunction of all the bits in t.
   Note that the result is #b0 if m is 0. 
  ")

  ; reduce_xor
  (declare-const reduce_xor
    (-> (! Int :var m :syntax ⟨numeral⟩ :implicit :restrict (<= 0 m)) 
        (BitVec m) (BitVec 1))
  :builtin
  "For all terms t of type (BitVec m),
   (reduce_xor t) returns the logical xor of all the bits in t.
   Note that the result is #b0 if m is 0. 
   ")

  ; bvite
  (define-const bvite
    (-> (! Int :var m :syntax ⟨numeral⟩ :implicit :restrict (<= 0 m)) 
        (BitVec 1) (BitVec m) (BitVec m) (BitVec m)) 
    (lambda ((m Int :syntax ⟨numeral⟩ :implicit :restrict (<= 0 m))
             (x (BitVec 1)) (y (BitVec m)) (z (BitVec m))) 
      (ite (= x #b1) y z))
   )

; TODO:   A (BV 1) version of all comparison ops
; PF TODO fix all functions
  (declare-const bv1ult
    (-> (! Int :var m :syntax ⟨numeral⟩ :implicit :restrict (<= 0 m)) 
      (BitVec m) (BitVec m) (BitVec 1))
    (lambda ((m Int :implicit :restrict (<= 0 m))
             (s (BitVec m)) (t (BitVec m)))
      (to_bv (bvult s t)))
   :notes
   "This function cannot be chainable like its boolean counterpart
    since it does not return a boolean
   ")

  ;------------------
  ; Bitvector values
  ;------------------

  (define-syntax 
    ; non-terminals
    (⟨bitvector_value⟩) 
    ; production rules
    ((⟨bitvector_value⟩ (bvempty ⟨binary⟩))))

  (define-values ((n Int :restrict (<= 0 n)))
    (BitVec n) 
    (! ⟨bitvector_value⟩ 
     :syntax-restrict 
     "bvemtpy when n = 0 and otherwise only binaries #bX with n digits"))
				     
))

 
(raw file)
(define-module Strings (
  (set-info :smt-lib-version 3.0)
  (set-info :smt-lib-release "2020-XX-XX")
  (set-info :written-by "Pascal Fontaine and Cesare Tinelli")
  (set-info :date "2020-07-01")
  (set-info :last-updated "2020-07-01")
  (set-info :update-history
   "TODO.
   "
    )
  (set-info :notes
    "This is a theory of character strings and regular expressions over an
     alphabet consisting of Unicode characters. It is not meant to be used in
     isolation but in combination with Ints, the theory of integer numbers.")

  (import (Core))
  (open Core)

  (set-info :notes
    "To define some of the semantics below, we use the following mathematical
     notation:

     - The semantic interpretation of terms of type String and Lan is provided
       by the function ⟦_⟧ used in the definition below for the builtin
       operators.

     - UC denotes the set of all integers from 0x000000 to 0xFFFFFF,
       representing a superset of the set of all code points for Unicode
       characters at the time of this writing. 

     - UC* is the set of all words, in the sense of universal algebra, 
       with juxtaposition denoting the concatenation operator here. 
       Note: Character positions in a word are numbered starting at 0.

     - If w is a word in UC*, |w| denotes the number of characters, i.e., 
       elements, in w.")


  ;------------------
  ; External symbols
  ;------------------

  ; Int
  (declare-type Int () :extern Ints::Int)

  ; (BitVec m)
  (declare-type BitVec ((! Int :var m :syntax ⟨int_value⟩ :restrict (> m 0)))
   :extern FixedSizeBitVectors::BitVec)

  (declare-const ⟨hexadecimal⟩
    (-> (! Int :var m :syntax ⟨int_value⟩ :restrict (> m 0) :implicit
         :restrict "m is 4 times the number of hexadecimal digits in the constant") 
        (BitVec m))
    :extern FixedSizeBitVectors::⟨hexadecimal⟩)

  ;---------------
  ; Builtin types
  ;---------------

  (declare-type String () :builtin
    "Strings denotes the set of unbounded strings of Unicode characters.
     Formally, ⟦String⟧ = UC* ")

  (declare-type Lan () :builtin
    "Lan denotes the set of all languages over Unicode strings.
     Formally, ⟦Lan⟧ is the powerset of ⟦String⟧. 
     Each element of ⟦Lan⟧ can be seen as a language with alphabet UC. 
     Each variable-free term of type Lan denotes a regular language in ⟦Lan⟧.")

  (set-info :notes
    "There was consensus in the community that having a character type is not
     really necessary and in fact complicates the theory. So the only way to
     express characters is to use strings of length 1.")

  ; In string fields below (which are double-quote-delimited) we cannot write
  ; something like "abc" to denote a string constant, we must use ""abc""
  ; instead.

  (set-info :notes
    "Because of SMT-LIB's own escaping conventions, string literals are written 
     in quadruple quotes, as in ""abc"", in textual fields here.")

  ;-----------
  ; Constants
  ;-----------

  ; string constants for _singleton_ strings, 
  ; i.e., strings consisting of exactly one character 

  (declare-const char
    (-> (! Int :var m :syntax ⟨int_value⟩ :implicit
               :restrict (or (= m 4) (= m 8) (= m 12) (= m 16) (= m 20) (= m 24)))
        (! (BitVec m) :value)
        String)
    :builtin
    "For all h, (char h) denotes a string of length 1 whose only character 
     is the Unicode character with code point n where n is the (unsigned) 
     integer represented by h. We identify Unicode characters
     with their code point, but expressed as a hexadecimal.
     For instance, 
     - (char #x2B) denotes the string ""+"" whose only character has code
       point 0x00002B (PLUS SIGN); 
     - (char #x27E8) denotes the string ""⟨"" whose only character has code
       point 0x0027E8 (MATHEMATICAL LEFT ANGLE BRACKET).")
 
  (set-info :notes
    "The argument of char can have at most 6 (hexadecimal) characters
     Ex:  (char #xA)  (char #x4E)  (char #x123)  (char #x1BC3D)
          (char #x01BC3D)")

  (set-info :notes
    "Because of leading zeros, the same one-character string is denoted  
     in more than one way. 
     Example: (char #x2B), (char #x02B), (char #x002B), (char #x0002B)
              and (char #x00002B)")

  (set-info :notes
    "The singleton string constants represent a superset of all the Unicode
     code points (from 0x000000 to 0x10FFFF) at the time of this writing.

     References:
     - https://www.unicode.org/main.html
     - http://www.utf8-chartable.de/
     - https://www.compart.com/en/unicode/")

  (set-info :notes
    "Rationale for the chosen notation for singleton string constants:
     Because of their large range, Unicode code points are typically given in
     hexadecimal notation. Using a hexadecimal directly to denote the
     corresponding character, however, would create an overloading problem in
     logics that combine this theory with that of bitvectors since hexadecimals
     denote bitvectors there.
     Using them as indices instead avoids this problem.")

  ; String constants
  (declare-const ⟨string⟩ String
    :restrict
    "All double-quote-delimited string literals consisting of printable US ASCII
     characters, i.e., those with Unicode code point from 0x000020 to 0x00007E.
     Arbitrary Unicode characters can be represented with _escape sequences_ 
     which can have one of the following forms 
      \ud₃d₂d₁d₀
      \u{d₀}
      \u{d₁d₀}
      \u{d₂d₁d₀}
      \u{d₃d₂d₁d₀}
      \u{d₄d₃d₂d₁d₀}
      \u{d₅d₄d₃d₂d₁d₀}
     where each dᵢ is a hexadecimal digit.
     This is enough to allow characters from all Unicode planes at the time of 
     this writing.
     The restriction to printable US ASCII characters in string constants is for
     simplicity since that set is universally supported."
    :builtin
    "1. The empty string constant """" is interpreted as the empty word ε of UC*.

     2. Each string constant containing a single (printable) US ASCII character
        is interpreted as the word consisting of the corresponding Unicode
        character code point.
      
        Ex: ⟦""m""⟧ = ⟦(char #x6D)⟧ = 0x00006D
            ⟦"" ""⟧ = ⟦(char #x20)⟧ = 0x000020

     3. Each string constant of the form ""\ud₃d₂d₁d₀"" where each dᵢ is a
        hexadecimal digit is interpreted as the word consisting of just the 
        character with code point 0xd₃d₂d₁d₀

        Ex: ⟦""\u003A""⟧ = ⟦(char #x3A)⟧ = 0x00003A

     4. Each literal of the form ""\u{d₀}"" (resp., ""\u{d₁d₀}"", ""\u{d₂d₁d₀}"",
        ""\u{d₃d₂d₁d₀}"", ""\u{d₄d₃d₂d₁d₀}"" or ""\u{d₅d₄d₃d₂d₁d₀}""), where each 
        dᵢ is a hexadecimal digit, is interpreted as the word consisting 
        of just the character with code point 0xd₀ (resp., 0xd₁d₀, 0xd₂d₁d₀, 
        0xd₃d₂d₁d₀, 0xd₄d₃d₂d₁d₀, or 0xd₅d₄d₃d₂d₁d₀).

        Ex: ⟦""\u{3A}""⟧ = ⟦(char #x3A)⟧ = 0x00003A

     5. ⟦l⟧ = ⟦l₁⟧⟦l₂⟧  if l does not start with an escape sequence and can be 
        obtained as the concatenation of a one-character string literal l₁ and
        a non-empty string literal l₂.

        Ex: ⟦""a\u02C1""⟧ = ⟦""a""⟧⟦""\u02C1""⟧ = 0x000061 0x0002C1
            ⟦""\u2CA""⟧ = 0x00005C ⟦""u2CXA""⟧        (not an escape sequence)
            ⟦""\u2CXA""⟧ = 0x00005C ⟦""u2CXA""⟧       (not an escape sequence)
            ⟦""\u{ACG}A""⟧ = 0x00005C ⟦""u{ACG}A""⟧   (not an escape sequence)

     6. ⟦l⟧ = ⟦l₁⟧⟦l₂⟧  if l can be obtained as the concatenation of string
        literals l₁ and l₂ where l₁ is an escape sequence and l₂ is non-empty.

        Ex: ⟦""\u02C1a""⟧ = ⟦""\u02C1""⟧⟦""a""⟧ = 0x0002C1 ⟦""a""⟧
            ⟦""\u{2C}1a""⟧ = ⟦""\u{2C}""⟧⟦""1a""⟧ = 0x00002C ⟦""1a""⟧

     Note: Character positions in a string literal are numbered starting at 0, 
           with escape sequences treated as a single character – consistently
           with their semantics.

           Ex.: In ""a\u1234T"", character a is at position 0, the character 
                corresponding to ""\u1234"" is at position 1, and character T is
                at position 2.")

  (set-info :notes
    "Observe that the first form, \ud₃d₂d₁d₀, has exactly 4 hexadecimal digit, 
     following the common use of this form in some programming languages.
     Unicode characters outside the range covered by \ud₃d₂d₁d₀ can be
     represented with the long forms \u{d₄d₃d₂d₁d₀} and \u{d₅d₄d₃d₂d₁d₀}.
  
     Also observe that programming language-specific escape sequences, such as
     \n, \b, \r and so on, are _not_ escape sequences in this theory as they
     are not fully standard across languages.

     SMT-LIB 2.6 has one escape sequence of its own for string literals. Two
     double quotes ("") are used to represent the double-quote character within 
     a string literal such as the one containing this very note. That escape 
     sequence is at the level of the SMT-LIB frontend of a solver, not at the 
     level of this theory.")

  ;---------------
  ; String values
  ;---------------
  (define-syntax 
    ; non-terminals
    (⟨string_value⟩) 
    ; production rules
    ((⟨string_value⟩ ((! ⟨string⟩ :syntax-restrict 
                                   "Same restriction as for string constants")))))
 
  (define-values () String ⟨string_value⟩)


  (set-info :notes
    "The set of values for String and Lan could be restricted further, to remove
     some redundancies. For instance, we could disallow leading zeros in escape
     sequences.
     For Lan, we could insist on some level of normalization for regular
     expression values. These restrictions are left to future versions.")

  (set-info :notes
    "All function symbols in this theory denote *total* functions, i.e., they
     are fully specified by the theory. This is achieved by returning _error_
     values for inputs where the intended functions are undefined.
     Error outputs are always outside of the range of the intended function,
     so there is no confusion with non-error outputs.")


  ;----------------
  ; Core functions
  ;----------------

 ; String functions

  ; Strings::++
  ; String concatenation
  (declare-const ++ (-> String String String) :builtin
    " ⟦++⟧ is the word concatenation function."
    :left-assoc)

  ; Lexicographic ordering
  (declare-const < (-> String String Bool) :builtin
    "⟦<⟧(w₁, w₂) is true iff w₁ is smaller than w₂ in the lexicographic 
     extension to UC* of the standard numerical < ordering over UC.

     Note: The order induced by < corresponds to alphabetical order
     for strings composed of characters from the alphabet of a western language 
     such as English:
     ⟦(< ""a"" ""aardvark"" ""aardwolf"" ... ""zygomorphic"" ""zygotic"")⟧ = true"
    :chainable)

  ; Functions on languages

  ; String to language injection
  (declare-const to_lan (-> String Lan) :builtin
    "⟦to_lan⟧(w) = { w } .
    ")

  (set-info :notes
    "Function to_lan allows one to write _symbolic regular expressions_, 
     e.g., Lan terms with subterms like (to_lan x) where x is a variable. 
     Such terms have more expressive power than regular expressions. This is 
     intentional, for future developments.
     The restriction to actual regular expressions will be imposed in a logic
     where to_lan will be applicable to string constants only.
    ")

  ; Language membership
  (declare-const in (-> String Lan Bool) :builtin
    "⟦in⟧(w)(L) = true iff w ∈ L .")

  ; Constant denoting the empty set of strings
  (declare-const empty Lan :builtin
    "⟦empty⟧ = ∅ .")

  ; Constant denoting the set of all strings 
  (declare-const univ Lan :builtin
    "⟦univ⟧ = ⟦String⟧ = UC* .")

  ; Constant denoting the set of all strings of length 1
  (declare-const allchar Lan :builtin
    "⟦allchar⟧ = { w ∈ UC* | |w| = 1 } .")
      
  ; Language concatenation
  (declare-const ++ (-> Lan Lan Lan) :builtin
    "⟦++⟧(L₁)(L₂) = { w₁w₂ | w₁ ∈ L₁ and w₂ ∈ L₂ } ."
    :left-assoc)

  ; Language union
  (declare-const union (-> Lan Lan Lan) :builtin
    "⟦union⟧(L₁)(L₂) = { w | w ∈ L₁ or w ∈ L₂ } ."
    :left-assoc)

  ; Language intersection
  (declare-const inter (-> Lan Lan Lan) :builtin
    "⟦inter⟧(L₁)(L₂) = { w | w ∈ L₁ and w ∈ L₂ } ."
    :left-assoc)

  ; Kleene Closure
  (declare-const * (-> Lan Lan) :builtin
    "⟦*⟧(L) is the smallest subset K of UC* such that
     1. ε ∈ K
     2. ⟦++⟧(L)(K) ⊆ K
    ")

  ;--------------------
  ; Bridging functions
  ;--------------------

  ; String length
  (declare-const len (-> String Int) :builtin
    "⟦len⟧(w) = |w|  (the number of characters, i.e., elements of UC, in w).")

  (set-info :notes
    "⟦len⟧(w) is **not** the number of bytes used by some Unicode encoding, such
     as UTF-8 – that number can be greater than the number of characters. 

     ⟦len(""\u1234"")⟧ is 1 since every escape sequence denotes a single
     character.")

  ;----------------------
  ; Additional functions
  ;----------------------

  ; Reflexive closure of lexicographic ordering
  (define-const <= (-> String String Bool)
    (lambda ((x String) (y String)) (or (= x y) (< x y)))
    :chainable)

  ; Substring
  ; (substr s i n) evaluates to the longest (unscattered) substring
  ; of s of length at most n starting at position i.
  ; It evaluates to the empty string if n is negative or i is not in 
  ; the interval [0,|s|-1].
  (declare-const substr (-> String Int Int String) :builtin
    "⟦substr⟧(w)(m)(n) is the unique word w₂ such that
       for some words w₁ and w₃
       - w = w₁w₂w₃ 
         - |w₁| = m
         - |w₂| = min(n, |w| - m)  if 0 <= m < |w| and 0 < n
       - ⟦substr⟧(w, m, n) = ε   otherwise")

  (set-info :notes
    "The second part of this definition makes ⟦substr⟧ a total function.")

  ; Singleton string containing a character at given position 
  ; or empty string when position is out of range.
  ; The leftmost position is 0.
  (define-const at (-> String Int String)
    (lambda ((x String) (n Int)) (substr x n 1)))

  ; First string is a prefix of second one.
  ; (prefixof s t) is true iff s is a prefix of t.
  (define-const prefixof (-> String String Bool)
    (lambda ((x String) (y String))
      (exists ((z String)) (= x (++ y z)))))

  ; First string is a suffix of second one.
  ; (suffixof s t) is true iff s is a suffix of t.
  (define-const suffixof (-> String String Bool)
    (lambda ((x String) (y String))
      (exists ((z String)) (= x (++ z y)))))

  ; First string contains second one
  ; (contains s t) iff s contains t.
  (define-const contains (-> String String Bool)
    (lambda ((x String) (y String))
      (exists ((z1 String) (z2 String)) (= x (++ z1 y z2)))))

  ; Index of first occurrence of second string in first one starting at 
  ; the position specified by the third argument.
  ; (indexof s t i), with 0 <= i <= |s| is the position of the first
  ; occurrence of t in s at or after position i, if any. 
  ; Otherwise, it is -1. Note that the result is i whenever i is within
  ; the range [0, |s|] and t is empty.
  (declare-const indexof (-> String String Int Int) :builtin 
    "⟦indexof⟧(w)(w₂)(i) is the smallest n such that for some words w₁, w₃
     - w = w₁w₂w₃
     - i <= n = |w₁| if ⟦contains⟧(w)(w₂) = true and i >= 0
     - ⟦indexof⟧(w,w₂,i) = -1  otherwise")

  ; Replace
  ; (replace s t t') is the string obtained by replacing the first
  ; occurrence of t in s, if any, by t'. Note that if t is empty, the
  ; result is to prepend t' to s; also, if t does not occur in s then
  ; the result is s.
  (declare-const replace (-> String String String String) :builtin 
    "- ⟦replace⟧(w)(w₁)(w₂) = w          if ⟦contains⟧(w, w₁) = false
     - ⟦replace⟧(w)(w₁)(w₂) = u₁w₂u₂
       where u₁ is the shortest word such that w = u₁w₁u₂
                                         if ⟦contains⟧(w, w₁) = true")

  ; (replace_all s t t’) is s if t is the empty string. Otherwise, it
  ; is the string obtained from s by replacing in one pass 
  ; all occurrences of t in s by t’, starting with the first occurrence 
  ; and proceeding in left-to-right order. 
  (declare-const replace_all (-> String String String String) :builtin 
    "- ⟦replace_all⟧(w)(w₁)(w₂) = w
       if ⟦contains⟧(w)(w₁) = false or w₁ = ε
     - ⟦replace_all⟧(w)(w₁)(w₂) = u₁w₂⟦replace_all⟧(u₂)(w₁)(w₂)
       otherwise
       where u₁ is the shortest word such that w = u₁w₁u₂")

  ; (replace_lan s l t) is the string obtained by replacing the
  ; shortest leftmost match of l in s, if any, by t.
  ; Note that if l contains the empty string, the result is to prepend t to s.
  (declare-const replace_lan (-> String Lan String String) :builtin 
    "- ⟦replace_lan⟧(w)(l)(u) = w₁uw₂
       where v is the shortest leftmost word from l in w such that w = w₁vw₂
     - ⟦replace_lan⟧(w)(l)(u) = w
       if no substring in w belongs to l")

  ; Language difference
  (declare-const diff (-> Lan Lan Lan) :builtin
    "⟦diff⟧(L₁)(L₂) = L₁ \ L₂"
    :left-assoc)

  ; Language complement
  (define-const comp (-> Lan Lan)  
    (lambda ((l Lan)) (diff univ l)))

  ; Language Kleene cross
  ; (+ e) abbreviates ee*.
  (define-const + (-> Lan Lan)  
    (lambda ((l Lan)) (++ l (* l))))

  ; Language option
  ; (opt e) abbreviates e ∪ {ε}
  (define-const opt (-> Lan Lan)
    (lambda ((l Lan)) (union l (to_lan ""))))

  ; Language range
  ; (range s₁ s₂) is the set of all *singleton* strings s such that
  ; (<= s₁ s s₂) provided s₁ and s₂ are singleton. Otherwise, it 
  ; is the empty language.
  (declare-const range (-> String String Lan) :builtin
    "- ⟦range⟧(w₁)(w₂) = { w ∈ UC | w₁ ⪯ w ⪯ w₂ } where ⪯ = ⟦<=⟧
       if |w₁| = |w₂| = 1
     - ⟦range⟧(w₁)(w₂) = ∅
       otherwise")

  (set-info :notes
    "⟦range⟧(⟦""ab""⟧, ⟦""c""⟧) = ∅
     ⟦range⟧(⟦""a""⟧, ⟦""bc""⟧) = ∅
     ⟦range⟧(⟦""c""⟧, ⟦""a""⟧)  = ∅")

  (set-info :notes
   "The arguments to range can be symbolic. This is intentional, 
    as in the case of to_re .")


  ; TODO change the order of the arguments of ^?

  ; Function symbol indexed by a numeral n.
  ; (^ n e) is the nth power of e:
  ; - (^ 0 e) = (to_re "") 
  ; - (^ n' e) = (++ e (^ n e))  where n' = n + 1
  ;   (^ n Lan Lan)
  (declare-const ^ (-> (! Int :syntax ⟨int_value⟩) Lan Lan) 
    :builtin
    "⟦^⟧(n)(L) = Lⁿ  where Lⁿ is defined inductively on n as follows:
      - L⁰ = { ε } 
      - Lⁿ⁺¹ = ⟦++⟧(L)(Lⁿ)"
  )

  ; Function symbol indexed by two numerals i and n.
  ; - (loop i n e) = empty                        if i > n
  ; - (loop i n e) = (^ n₁ e)                     if i = n
  ; - (loop i n e) = (union (^ i e) ... (^ n e))  if i < n
  (declare-const loop 
    (-> (! Int :var i :syntax ⟨int_value⟩ :restrict (>= i 0)) 
        (! Int :var n :syntax ⟨int_value⟩ :restrict (>= n 0))  
        Lan Lan)
    :builtin
    "⟦loop⟧(i)(n)(L) = Lⁱ ∪ ... ∪ Lⁿ   if i <= n       
     ⟦loop⟧(i)(n)(L) = ∅               otherwise")

  ; (is_digit s) is true iff s consists of a single character which is 
  ; a decimal digit, that is, a code point in the range 0x000030 ... 0x000039.
  (define-const is_digit (-> String Bool)
    (lambda ((s String))
      (or (= s (char #x000030))
        (= s (char #x000031))
        (= s (char #x000032))
        (= s (char #x000033))
        (= s (char #x000034))
        (= s (char #x000035))
        (= s (char #x000036))
        (= s (char #x000037))
        (= s (char #x000038))
        (= s (char #x000039)))))
;     ⟦is_digit⟧(w) = true  iff |w| = 1 and 0x000030 <= w <= 0x000039

  ;-----------------------------------------
  ; Bridging functions to and from integers
  ;-----------------------------------------

  ; (to_code s) is the code point of the only character of s, 
  ; if s is a singleton string; otherwise, it is -1. 
  (declare-const to_code (-> String Int) :builtin
    "- ⟦to_code⟧(w) = -1 if |w| ≠ 1
     - ⟦to_code⟧(w) = i  otherwise, where i is such that w = (char i)
                                    since w consists of a single code point")

  ; (from_code n) is the singleton string whose only character is
  ; code point n if n is in the range [0, 0xFFFFFF].
  ; Otherwise, it is the empty string.
  (declare-const from_code (-> Int String) :builtin
    "- ⟦from_code⟧(n) = n        if 0x000000 <= n <= 0xFFFFFF 
     - ⟦from_code⟧(n) = ε        otherwise")

  ; Conversion to integers
  ; (to_int s) with s consisting of digits (in the sense of is_digit)
  ; evaluates to the positive integer denoted by s when seen as a number in 
  ; base 10 (possibly with leading zeros).
  ; It evaluates to -1 if s is empty or contains non-digits. 
  (declare-const to_int (-> String Int) :builtin
    "- ⟦to_int⟧(w) = -1 if w = ⟦l⟧ 
       where l is the empty string literal or one containing anything other than
       digits, i.e., characters with code point outside the range
       0x000030–0x000039

     - ⟦to_int⟧(w) = n if w = ⟦l⟧  
       where l is a string literal consisting of a single digit denoting
       number n

     - ⟦to_int⟧(w) = 10*⟦to_int⟧(w₁) + ⟦to_int⟧(w₂) if 
         - w = w₁w₂
         - |w₁| > 0
         - |w₂| = 1
         - ⟦to_int⟧(w₁) >= 0
         - ⟦to_int⟧(w₂) >= 0")

  (set-info :notes
    "to_int is made total by mapping the empty word and words with
     non-digits to -1.

     The function returns a non-negative number also for words that start
     with (characters corresponding to) superfluous zeros, such as 
     ⟦""0023""⟧.")

  ; Conversion from integers.
  ; (from_int n) with n non-negative is the corresponding string in
  ; decimal notation, with no leading zeros. If n < 0, it is the empty string. 
  (declare-const from_int (-> Int String) :builtin
    "- ⟦from_int⟧(n) = w    if n > 0, where w is the shortest word
                                      such that ⟦to_int⟧(w) = n
     - ⟦from_int⟧(n) = ε    otherwise")

  (set-info :notes
    "from_int is made total by mapping negative integers to the empty word.

     ⟦to_int⟧(⟦from_int⟧(n)) = n iff n is a non-negative integer.

     ⟦from_int⟧(⟦to_int⟧(w)) = w iff w consists only of digits *and*
                                     has no leading zeros.")

  (set-info :notes
    "(to_int ""00123"") evaluates to 123.
     (from_int 123) evaluates to ""123"".
     (to_int ""-123"") evaluates to -1, an error value, not to -123.
     (from_int -123) evaluates to """", an error value, not to ""-123"".")
))

 
(raw file)

SMT-LIB 3 Logics

One of the goals of SMT-LIB 3 is to get rid of the notion of SMT-LIB logic (such as QF_UF, LIA, and so on) and provide users the ability, in effect, to construct logics on the fly by importing the right modules with the right restrictions at the beginning of an SMT-LIB script.

With some limitations, modules, interfaces and imports provide a suitable mechanism for indicating to an SMT solver a specific fragment in which to reason, allowing the solver to apply the best reasoning techniques at its disposal for that fragment, or recognize it as unsupported.

SMT-LIB logics, however, have been used historically also for another, orthogonal purpose: providing a way to index/structure the SMT-LIB benchmark library. Concurrently, they have also been used by the SMT-COMP competition as a convenient way to structure the competition in divisions. These uses of logics make them still necessary. So the set-logic will not go away anytime soon. However, in SMT-LIB 3 it becomes an abbreviation of a particular sequence of module declarations and imports.

For example, (set-logic QF_UF) could become an abbreviation of

(define-module QF_UF
 (
  (import (Core))
  (open Core) 
 )  
 :types (Bool)
 :consts (
   (true Bool)
   (false Bool)
   (ite (→ (! Type :var A :implicit) Bool A A A))
   (not (→ Bool Bool))
   (= (→ (! Type :var A :implicit) A A Bool))
   (distinct (→ (! Type :var A :implicit) A A Bool))
   (and (→ Bool Bool Bool))
   (or (→ Bool Bool Bool))
   (=> (→ Bool Bool Bool))
   (xor (→ Bool Bool Bool))
   ; quantifiers are not exported
 )
 :FOL ; only FOL syntax allowed
)
(import (QF_UF))
(open QF_UF)
...
The command (set-info :FOL) is a (imperfect) solution of the problem of capturing the Version 2.6 restriction for terms to be in the applicative fragment where all function symbols (i.e., non-nullary constants of type (→ τ₁ ⋅⋅⋅ τᵢ)) are fully applied (i,e., applied to exactly i arguments).

Similarly, for the logic LIA (set-logic LIA) would be and abbreviation of
(define-module LIA 
 (
  (import (Core Ints))
  (open Core)
  (open Ints)
 )
 :types (Bool Int)
 :consts (
    (<numeral> Int)
    (+ (→ Int Int Int))
    (- (→ Int Int Int))
    (* (→ (! Int :syntax <int_value>) (! Int :syntax <symbol>) Int)) ; linear mul.
    (* (→ (! Int :syntax <symbol>) (! Int :syntax <int_value>) Int)) ; linear mul.
    (> (→ Int Int Bool))
    (>= (→ Int Int Bool))
    (< (→ Int Int Bool))
    (<= (→ Int Int Bool))
    (true Bool)
    (false Bool)
    (ite (→ (! Type :var A :implicit) Bool A A A)) 
    (not (→ Bool Bool))
    (= (→ (! Type :var A :implicit) A A Bool))
    (distinct (→ (! Type :var A :implicit) A A Bool))
    (and (→ Bool Bool Bool))
    (or (→ Bool Bool Bool))
    (=> (→ Bool Bool Bool))
    (xor (→ Bool Bool Bool))
    (forall (→ (! Type :var A :implicit) (→ A Bool) Bool))
    (exists (→ (! Type :var A :implicit) (→ A Bool) Bool))
 )
 :FOL ; only FOL syntax allowed
) 
(import (LIA))
(open LIA)
Note that the linearity restriction does not need an ad-hoc attribute thanks to the restriction imposed on the type of the multiplication symbol * requiring one of its two arguments to be a concrete integer value.