Jump to content

Subtyping: Difference between revisions

From Wikipedia, the free encyclopedia
Content deleted Content added
No edit summary
Tags: nowiki added Visual edit Mobile edit Mobile web edit
Undid revision 964835678 by 2804:214:8490:7434:1:1:2DF4:1FBC (talk)
Line 1: Line 1:
Polymorphism
{{Polymorphism}}
In programming language theorysubtypingalso subtype polymorphismorinclusion polymorphism is a form Polymorphism computer sciencetype polymorphism in which asubtypeis a datatype that is related to another datatype the supertype'by some notion Liskov substitution principlesubstitutability, meaning that program elements, typically subroutinesor functions, written to operate on elements the supertype can also operate on elements the subtype.If S is a subtype T,the subtyping binary relationrelation is often written S T, to mean that any term of type S can be safely used in a context where a term type T is expected. The precise semantics subtyping crucially depends on the particulars what safely used in a context where" means in a givenprogramming language. Thetype system a programming language essentially defines its own subtyping relation, which may well beidentity relation|trivial should the language support no or very littleconversion mechanisms.
In [[programming language theory]], '''subtyping''' (also '''subtype polymorphism''' or '''inclusion polymorphism''') is a form of [[Polymorphism (computer science)|type polymorphism]] in which a '''subtype''' is a [[datatype]] that is related to another datatype (the '''supertype''') by some notion of [[Liskov substitution principle|substitutability]], meaning that program elements, typically [[subroutines]] or functions, written to operate on elements of the supertype can also operate on elements of the subtype. If S is a subtype of T, the subtyping [[binary relation|relation]] is often written S <: T, to mean that any term of type S can be ''safely used in a context where'' a term of type T is expected. The precise semantics of subtyping crucially depends on the particulars of what "safely used in a context where" means in a given [[programming language]]. The [[type system]] of a programming language essentially defines its own subtyping relation, which may well be [[identity relation|trivial]] should the language support no (or very little) conversion mechanisms.

Due to the subtyping relation, a term may belong to more than one type. Subtyping is therefore a form of type polymorphism. In [[object-oriented programming]] the term 'polymorphism' is commonly used to refer solely to this ''subtype polymorphism'', while the techniques of [[parametric polymorphism]] would be considered ''[[generic programming]]''.

[[Functional programming languages]] often allow the subtyping of [[Record (computer science)|records]]. Consequently, [[simply typed lambda calculus]] extended with record types is perhaps the simplest theoretical setting in which a useful notion of subtyping may be defined and studied {{Citation needed|date=May 2017}}. Because the resulting calculus allows terms to have more than one type, it is no longer a "simple" [[type theory]]. Since functional programming languages, by definition, support [[function literals]], which can also be stored in records, records types with subtyping provide some of the features of object-oriented programming. Typically, functional programming languages also provide some, usually restricted, form of parametric polymorphism. In a theoretical setting, it is desirable to study the interaction of the two features; a common theoretical setting is [[system F-sub|system F<sub><:</sub>]]. Various calculi that attempt to capture the theoretical properties of object-oriented programming may be derived from system F<sub><:</sub>.

The concept of subtyping is related to the linguistic notions of [[hyponymy]] and [[holonymy]]. It is also related to the concept of [[bounded quantification]] in mathematical logic. Subtyping should not be confused with the notion of (class or object) [[inheritance (computer science)|inheritance]] from object-oriented languages;{{sfn|Cook|Hill|Canning|1990}} subtyping is a relation between types (interfaces in object-oriented parlance) whereas inheritance is a relation between implementations stemming from a language feature that allows new objects to be created from existing ones. In a number of object-oriented languages, subtyping is called '''interface inheritance''',<!-- todo add the insightful example from Mitchell with stack/queue/dequeue in the body --> with inheritance referred to as ''implementation inheritance''.

== Origins ==
The notion of subtyping in programming languages dates back to the 1960s; it was introduced in [[Simula]] derivatives. The first formal treatments of subtyping were given by [[John C. Reynolds]] in 1980 who used [[category theory]] to formalize [[implicit conversion]]s, and [[Luca Cardelli]] (1985).<ref>Pierce, ch. 15 notes</ref>

The concept of subtyping has gained visibility (and synonymy with polymorphism in some circles) with the mainstream adoption of object-oriented programming. In this context, the principle of safe substitution is often called the [[Liskov substitution principle]], after [[Barbara Liskov]] who popularized it in a [[keynote]] address at a conference on object-oriented programming in 1987. Because it must consider mutable objects, the ideal notion of subtyping defined by Liskov and [[Jeannette Wing]], called [[behavioral subtyping]] is considerably stronger than what can be implemented in a [[type checker]]. (See [[#Function types|Function types]] below for details.)

== Examples ==
[[Image:Inheritance.svg|thumb|right|350px|Example of subtypes: where bird is the supertype and all others are subtypes as denoted by the arrow in [[Unified Modeling Language|UML]] notation]]
A simple practical example of subtypes is shown in the diagram, right. The type "bird" has three subtypes "duck", "cuckoo" and "ostrich". Conceptually, each of these is a variety of the basic type "bird" that inherits many "bird" characteristics but has some specific differences. The [[Unified Modeling Language|UML]] notation is used in this diagram, with open-headed arrows showing the direction and type of the relationship between the supertype and its subtypes.

As a more practical example, a language might allow integer values to be used wherever floating point values are expected (<code>Integer</code> <: <code>Float</code>), or it might define a generic type <samp>Number</samp> as a common supertype of integers and the reals. In this second case, we only have <code>Integer</code> <: <code>Number</code> and <code>Float</code> <: <code>Number</code>, but <code>Integer</code> and <code>Float</code> are not subtypes of each other.

Programmers may take advantage of subtyping [[abstraction principle (programming)|to write code in a more abstract manner]] than would be possible without it. Consider the following example:

<syntaxhighlight lang="vb">
function max (x as Number, y as Number) is
if x < y then
return y
else
return x
end
</syntaxhighlight>

If integer and real are both subtypes of <code>Number</code>, and an operator of comparison with an arbitrary Number is defined for both types, then values of either type can be passed to this function. However, the very possibility of implementing such an operator highly constrains the Number type (for example, one can't compare an integer with a complex number), and actually only comparing integers with integers and reals with reals makes sense. Rewriting this function so that it would only accept 'x' and 'y' of the same type requires [[bounded polymorphism]].

== Subsumption ==

In type theory the concept of ''subsumption''<ref>Benjamin C. Pierce, ''Types and Programming Languages'', MIT Press, 2002, 15.1 "Subsumption", p. 181-182</ref> is used to define or evaluate whether a type '''S''' is a subtype of type '''T'''.

A type is a set of values. The set can be described ''extensionally'' by listing all the values, or it can be described ''intensionally'' by stating the membership of the set by a predicate over a domain of possible values. In common programming languages enumeration types are defined extensionally by listing values. User-defined types like records (structs, interfaces) or classes are defined intensionally by an explicit type declaration or by using an existing value, which encodes type information, as a prototype to be copied or extended.

In discussing the concept of subsumption, the set of values of a type is indicated by writing its name in mathematical italics: {{math|''T''}}. The type, viewed as a predicate over a domain, is indicated by writing its name in bold: '''T'''. The conventional symbol '''<:''' means "is a subtype of", and ''':>''' means "is a supertype of".

* A type '''T''' ''subsumes'' '''S''' if the set of values {{math|''T''}} which it defines, is a superset of the set {{math|''S''}}, so that every member of {{math|''S''}} is also a member of {{math|''T''}}.
* A type may be subsumed by more than one type: the supertypes of '''S''' intersect at {{math|''S''}}.
* If '''S <: T''' (and therefore {{math|''S''}} ⊆ {{math|''T'' }}), then '''T''', the predicate which circumscribes the set {{math|''T''}}, must be part of the predicate '''S''' (over the same domain) which defines {{math|''S''}}.
* If '''S''' subsumes '''T''', and '''T''' subsumes '''S''', then the two types are equal (although they may not be the same type if the type system distinguishes types by name).

A rule of thumb follow: a subtype is likely to be "bigger/wider/deeper" (its values hold more information) and "fewer/smaller" (the set of values is smaller) than one of its supertypes (which has more restricted information, and whose set of values are a superset of those of the subtype).

In the context of subsumption type definitions can be expressed using [[Set-builder notation]], which uses a predicate to define a set. Predicates can be defined, over a domain (set of possible values) {{math|''D''}}. Predicates are partial functions that compare values to selection criteria. For example: "is an integer value greater than or equal to 100 and less than 200?". If a value matches the criteria then the function returns the value. If not, the value is not selected, and nothing is returned. (List comprehensions are a form of this pattern used in many programming languages.)

If there are two predicates, <math>P_T</math> which applies selection criteria for the type '''T''', and <math>P_s</math> which applies additional criteria for the type '''S''', then sets for the two types can be defined:

:<math>T = \{v \in D \mid \ P_T(v)\}</math>

:<math>S =\{v \in D \mid \ P_T(v)\text{ and }P_s(v)\}</math>
The predicate '''T''' = <math>P_T</math> is applied alongside <math>P_s</math> as part of the compound predicate '''S''' defining {{math|''S''}}. The two predicates are ''conjoined'', so both must be true for a value to be selected. The predicate '''S''' = '''T''' <math>\land P_s</math> = <math>P_T \land P_s</math> subsumes the predicate '''T''', so '''S''' '''<:''' (subtypes) '''T'''.

For example: there is a subfamily of cat species called ''Felinae'', which is part of the family ''Felidae''. The genus ''Felis'', to which the domestic cat species ''Felis catus'' belongs, is part of that subfamily.

:<math>Felinae = \{cat \in Felidae \mid \ ofSubfamily(cat, felinaeSubfamilyName)\}</math>

:<math>Felis =\{cat \in Felinae \mid \ ofGenus(cat, felisGenusName)\}</math>

The conjunction of predicates has been expressed here through application of the second predicate over the domain of values conforming to the first predicate. Viewed as types, '''Felis <: Felinae <: Felidae'''.

If '''T''' subsumes '''S''' ('''T :> S''') then a procedure, function or expression given a value <math>s \in S</math> as an operand (input argument or term) will therefore be able to operate over that value as one of type '''T''', because <math>s \in T</math>. In the example above, we could expect the function ''ofSubfamily'' to be applicable to values of all three types '''Felidae''', '''Felinae''' and '''Felis'''.

== Subtyping schemes ==

Type theorists make a distinction between '''[[nominal type system|nominal subtyping]]''', in which only types declared in a certain way may be subtypes of each other, and '''[[structural type system|structural subtyping]]''', in which the structure of two types determines whether or not one is a subtype of the other. The class-based object-oriented subtyping described above is nominal; a structural subtyping rule for an object-oriented language might say that if objects of type ''A'' can handle all of the messages that objects of type ''B'' can handle (that is, if they define all the same [[Method (computer science)|method]]s), then ''A'' is a subtype of ''B'' regardless of whether either [[inheritance (computer science)|inherits]] from the other. This so-called ''[[duck typing]]'' is common in dynamically typed object-oriented languages. Sound structural subtyping rules for types other than object types are also well known.{{citation needed|date=June 2012}}

Implementations of programming languages with subtyping fall into two general classes: ''inclusive'' implementations, in which the representation of any value of type ''A'' also represents the same value at type ''B'' if ''A''<samp>&lt;:</samp>''B'', and ''coercive'' implementations, in which a value of type ''A'' can be ''automatically converted'' into one of type ''B''. The subtyping induced by subclassing in an object-oriented language is usually inclusive; subtyping relations that relate integers and floating-point numbers, which are represented differently, are usually coercive.

In almost all type systems that define a subtyping relation, it is reflexive (meaning ''A''<samp>&lt;:</samp>''A'' for any type ''A'') and transitive (meaning that if ''A''<samp>&lt;:</samp>''B'' and ''B''<samp>&lt;:</samp>''C'' then ''A''<samp>&lt;:</samp>''C''). This makes it a [[preorder]] on types.

== Record types ==

=== Width and depth subtyping ===

Types of [[Record (computer science)|records]] give rise to the concepts of ''width'' and ''depth'' subtyping. These express two different ways of obtaining a new type of record that allows the same operations as the original record type.

Recall that a record is a collection of (named) fields. Since a subtype is a type which allows all operations allowed on the original type, a record subtype should support the same operations on the fields as the original type supported.

One kind of way to achieve such support, called ''width subtyping'', adds more fields to the record. More formally, every (named) field appearing in the width supertype will appear in the width subtype. Thus, any operation feasible on the supertype will be supported by the subtype.

The second method, called ''depth subtyping'', replaces the various fields with their subtypes. That is, the fields of the subtype are subtypes of the fields of the supertype. Since any operation supported for a field in the supertype is supported for its subtype, any operation feasible on the record supertype is supported by the record subtype. Depth subtyping only makes sense for immutable records: for example, you can assign 1.5 to the 'x' field of a real point (a record with two real fields), but you can't do the same to the 'x' field of an integer point (which, however, is a deep subtype of the real point type) because 1.5 is not an integer (see [[Covariance and contravariance (computer science)|Variance]]).

Subtyping of records can be defined in [[System F-sub|System F<sub><:</sub>]], which combines [[parametric polymorphism]] with subtyping of record types and is a theoretical basis for many [[functional programming languages]] that support both features.

Some systems also support subtyping of labeled [[disjoint union]] types (such as [[algebraic data type]]s). The rule for width subtyping is reversed: every tag appearing in the width subtype must appear in the width supertype.

==Function types==

If ''T''<sub>1</sub> → ''T''<sub>2</sub> is a function type, then a subtype of it is any function type ''S''<sub>1</sub> → ''S''<sub>2</sub> with the property that ''T''<sub>1</sub> <: ''S''<sub>1</sub> and ''S''<sub>2</sub> <: ''T''<sub>2</sub>. This can be summarised using the following [[type rule|typing rule]]:


Due to the subtyping relation, a term may belong to more than one type. Subtyping is therefore a form type polymorphism. In object-oriented programmingthe term 'polymorphism' is commonly used to refer solely to this 'subtype polymorphism, while the techniques parametric polymorphism would be considered generic programming.
Functional programming languages often allow the subtyping Record computer sciencerecords Consequently, simply typed lambda calculus]] extended with record types is perhaps the simplest theoretical setting in which a useful notion of subtyping may be defined and studied Citation neededdate=May 2017. Because the resulting calculus allows terms to have more than one type, it is no longer a "simple type theory Since functional programming languages, by definition, support function literals, which can also be stored in records, records types with subtyping provide some of the features object-oriented programming. Typically, functional programming languages also provide some, usually restricted,form parametric polymorphism. In a theoretical setting, it is desirable to study the interaction the two features; a common theoretical setting is system F-subsystem F<sub><sub> Various calculi that attempt to capture the theoretical properties object-oriented programming may be derived from system F<sub><.
The concept subtyping is related to the linguistic notions hyponymyandholonymyIt is also related to the concept bounded quantification in mathematical logic Subtyping should not be confused with the notion class or objectinheritance computer scienceinheritance from object-oriented languages sfnCookHillCanning1990 subtyping is a relation between types interfaces in object-oriented parlance whereas inheritance is a relation between implementations stemming from a language feature that allows new objects to be created from existing ones. In a number object-oriented languages,subtyping is calledinterface inheritance<- todo add the insightful example from Mitchell with stackqueuedequeue in the body>< with inheritance referred to asimplementation inheritance
Origins
The notion subtyping in programming languages dates back to the 1960s; it was introduced in Simula derivatives. The first formal treatments of subtyping were given by John C. Reynolds in 1980 who used category theory to formalize implicit conversions, and Luca Cardelli1985<ref>Pierce, ch. 15 notes<nowiki><ref></nowiki>
The conceptsubtyping has gained visibility (and synonymy with polymorphism in some circles) with the mainstream adoption object-oriented programming. In this context, the principle safe substitution is often called the Liskov substitution principle after Barbara Liskov who popularized it in a keynote address at a conference on object-oriented programming in 1987. Because it must consider mutable objects, the ideal notion of subtyping defined by Liskov and Jeannette Wing,called behavioral subtyping is considerably stronger than what can be implemented in a type checker.See Function typesFunction types below for details.ExamplesImage:Inheritance.svgthumbright350pxExample subtypes: where bird is the supertype and all others are subtypes as denoted by the arrow in Unified Modeling LanguageUML notation
A simple practical example subtypes is shown in the diagram, right.The type bird" has three subtypes duck cuckoo and ostrich Conceptually each these is a variety the basic type "bird" that inherits many "bird" characteristics but has some specific differences. TheUnified Modeling Language UML notation is used in this diagram, with open-headed arrows showing the direction and type the relationship between the supertype and its subtypes.
As a more practical example, a language might allow integer values to be used wherever floating point values are expected<code>Integer<code>< <code>Float<code>or it might define a generic type <samp>Number<samp> as a common supertype integers and the reals. In this second case, we only have <code>Integer<code><: <code>Number<code>< and <code>Float<code><: <code>Number<code>< but <code>Integer<code>< and <code>Float<code>< are not subtypes each other.
Programmers may take advantage subtyping abstraction principle programmingto write code in a more abstract mannerthan would be possible without it. Consider the following example:
<nowiki><syntaxhighlight lang=vb></nowiki>
function max x as Number y as Numberisif x < y thenreturn yelsereturnend
<nowiki><syntaxhighlight></nowiki>
Ifinteger and real are both subtypes><code>Number<code> and an operator comparison with an arbitrary Number is defined for both types, then values either type can be passed to this function. However the very possibilityimplementing such an operator highly constrains the Number type for example, one can't compare an integer with a complex number and actually comparing integers with integers and reals with reals makes sense. Rewriting this function so that it would only accept x and y the same type requiresbounded polymorphism
Subsumption
In type theory the conceptsubsumption<nowiki><ref>Benjamin C. Pierce Types and Programming Languages MIT Press 2002 151 Subsumption p. 181182<ref>is used to define or evaluate whether a type S is a subtypetypeT</nowiki>
A type is a set values. The set can be described extensionally by listing all the values, or it can be described intensionally by stating the membership of the set by a predicate over a domain possible values.In common programming languages enumeration types are defined extensionally by listing values.User-defined types like records structs, interfaces or classes are defined intensionally by an explicit type declaration or by using an existing value, which encodes type information as a prototype to be copied or extended.
In discussing the concept subsumption, the set values a type is indicated by writing its name in mathematical italics:math T The type viewed as a predicate over a domain, is indicated by writing its name in bold: T The conventional symbol< means is a subtype and > means is a supertype A typeT subsume Sbif the set valuesmathT which it defines, is a superset the set math S so that every member mathS'is also a membermath T A type may be subsumed by more than one type:the supertypes S intersect atmathS
IfTand therefore mathS⊆ mathTthenTthe predicate which circumscribes the set mathTmust be partthe predicate Sover the same domain which definesmathSIfSsubsumesTand T subsumes Sthen the two types are equal although they may not be the same type if the type system distinguishes types by name
A rule thumb follow: a subtype is likely to bebiggerwiderdeeperits values hold more information and fewersmallerthe set values is smallerthan one its supertypes which has more restricted information and whose setvalues are a supersetthose the subtype).
In the context subsumption type definitions can be expressed using Set-builder notation which uses a predicate to define a set. Predicates can be defined, over a domain set possible values mathD. Predicates are partial functions that compare values to selection criteria. For example:is an integer value greater than or equal to 100 and less than 200 If a value matches the criteria then the function returns the value. If not, the value is not selected, and nothing is returned.List comprehensions are a form this pattern used in many programming languages.
If there are two predicates, <math>P_T<math> which applies selection criteria for the typeT and <math>P_s<math> which applies additional criteria for the type S then sets for the two types can be defined<math>T =vinDmid P_Tv<math><math>S=vin D mid P_TvtextandPsv<math>
The predicate T= <math>PT</math> is applied alongside <nowiki><math>P_s<math> as part the compound predicate S defining mathSThe two predicates are conjoined,so both must be true for a value to be selected.The predicateS=T<math>land P_s<math>=<math>PTland Ps<math>subsumes the predicateTsoS<subtypes T</nowiki>
For example:there is a subfamily cat species called Felinae which is part the family Felidae The genus Felis to which the domestic cat species Felis catus'' belongs is part that subfamily<nowiki><math>Felinae =catin Felidae midSubfamilycat felinaeSubfamilyName<math><math>Felis =catin Felinaemid Genuscat, felisGenusName<math></nowiki>''
The conjunction predicates has been expressed here through application the second predicate over the domain values conforming to the first predicate. Viewed as typesFelis <: Felinae<Felidae
IfTsubsumesST>Sthen a procedure function or expression given a value <nowiki><math>sin S<math> as an operand input argument or term will therefore be able to operate over that value as one typeT because<math>sin T<math>In the example above, we could expect the function Subfamily to be applicable to values all three types FelidaeFelinaeand FelisSubtyping schemes </nowiki>
Type theorists make a distinction between nominal type system nominal subtyping in which only types declared in a certain way may be subtypes each other,and structural type system structural subtypingin which the structure two types determines whether or not one is a subtype the other.The class-based object-oriented subtyping described above is nominal a structural subtyping rule for an object-oriented language might say that if objects type A can handle all the messages that objects type B can handle that is,if they define all the same Method computer sciencemethods then A is a subtype B regardless whether either inheritance computer scienceinheritsfrom the otherThis so-called duck typing is common in dynamically typed object-oriented languages. Sound structural subtyping rules for types other than object types are also wellcitation needed date=June 2012
Implementations programming languages with subtyping fall into two general classes: inclusive implementations, in which the representation of any value of type ''A'' also represents the same value at type BifA<samp>&lt<samp>'B' and coercive implementations, in which a valuetype Acan be automatically converted into one of type ''B''. The subtyping induced by subclassing in an object-oriented language is usually inclusive;subtyping relations that relate integers and floating-point numbers which are represented differently, are usually coercive.
In almost all type systems that define a subtyping relation, it is reflexive meaning A<samp>&lt<samp>Afor any type A'and transitive meaning that ifA<samp>&lt<samp>B and B<samp>&lt<samp>C then A<samp>&lt<samp>CThis makes it a preorder on types.
Record types
Width and depth subtyping
Types Record computer sciencerecordsgive rise to the concepts widt and depth subtyping.These express two different ways obtaining a new type record that allows the same operations as the original record type.
Recall that a record is a collection named fields. Since a subtype is a type which allows all operations allowed on the original type, a record subtype should support the same operations on the fields as the original type supported.
One kind way to achieve such support,called width subtyping,adds more fields to the record.More formally every named field appearing in the width supertype will appear in the width subtype. Thus any operation feasible on the supertype will be supported by the subtype.
The second method,
called ''depth subtyping replaces the various fields with their subtypes. That is, the fields the subtype are subtypes the fields of the supertype. Since any operation supported for a field in the supertype is supported for its subtype, any operation feasible on the record supertype is supported by the record subtype. Depth subtyping only makes sense for immutable records: for example you can assign 1.5 to the field of a real point (a record with two real fields), but you can't do the same to thefield of an integer point (which, however,is a deep subtype the real point type) because 1.5 is not an integer see Covariance and contravariance computer scienceVariance.''
Subtyping records can be defined in System FsubSystem F<sub><<sub> which combines parametric polymorphism with subtyping record types and is a theoretical basis for many functional programming languagesthat support both features.
Some systems also support subtyping labeled disjoint uniontypes such as algebraic data types.The rule for width subtyping is reversed: every tag appearing in the width subtype must appear in the width supertype.Function types
If T<sub>1<sub> → T<sub>2<sub> is a function type then a subtype of it is any function type S<sub>1<sub>S<sub>2<sub> with the property that><sub>1<sub><sub>1<sub>< and <sub>2<sub><<sub>2<sub>.This can be summarised using the following type ruletyping rule
<blockquote>
<blockquote>
<nowiki><math>T1 leq: S1quad S2 leq: T2over</nowiki>
<math>T_1 \leq: S_1 \quad S_2 \leq: T_2
\over
S1 rightarrow S2 leq: T1 rightarrow T2
S_1 \rightarrow S_2 \leq: T_1 \rightarrow T_2
<nowiki><math></nowiki>
</math>
<blockquote>
</blockquote>
The argument typeS<sub>1<sub>S<sub>2<sub> is said to be Covariance and contravariance computer science contravariant because the subtyping relation is reversed for it,whereas the return type is Covariance and contravariance computer sciencecovariant.Informally this reversal because the refined type is more liberal in the types it accepts and "more conservative in the type it returns.This is what exactly works inScalaprogramming_languageScala: a n-ary function is internally a class that inherits the FunctionN-A1A2-An B Trait computer programmingtrait which can be seen as a generalApplication programming interfaceinterfacein Java (programming language Java like languages), whereA1A2Anare the parameter types and Bis its return type before the type means the type is contravariant while + means covariant.

In languages that allow side effects like most objectoriented languages subtyping is generally not sufficient to guarantee that a function can be safely used in the context another Liskov's work in this area focused onbehavioral subtyping, which besides the type system safety discussed in this article also requires that subtypes preserve allInvariant computer science invariantsguaranteed by the supertypes in some Design by Contractcontract<nowiki><ref name=LSP>Barbara Liskov, Jeannette Wing http:citeseerxistpsueduviewdocsummary doi=1011391223 A behavioral notion subtyping ACM Transactions on Programming Languages and Systems,Volume Issue November1994 pp.18111841.An updated version appeared as CMU technical report:cite weburl=http:reports-archiveadmcscmueduanon1999CMU-CS-99156.pstitle=Behavioral Subtyping Using Invariants and Constraintslast=Liskovfirst=Barbaraauthorlink=Barbara Liskovauthor2=Wing, Jeannetteauthorlink2=Jeannette Wing date=July 1999format=PostScriptPSaccessdate=2006-10-05</nowiki></ref> This definition subtyping is generallyundecidable problem undecidable, so it cannot be verified by atype checker
The argument type of ''S''<sub>1</sub> → ''S''<sub>2</sub> is said to be [[Covariance and contravariance (computer science)|contravariant]] because the subtyping relation is reversed for it, whereas the return type is [[Covariance and contravariance (computer science)|covariant]]. Informally, this reversal occurs because the refined type is "more liberal" in the types it accepts and "more conservative" in the type it returns. This is what exactly works in [[Scala_(programming_language) |Scala]]: a ''n''-ary function is internally a class that inherits the '''FunctionN(-A1, -A2, …, -An, +B)''' [[Trait (computer programming) |trait]] (which can be seen as a general [[Application programming interface|interface]] in [[Java (programming language)|Java]]-like languages), where ''A1'', ''A2'', … ''An'' are the parameter types, and ''B'' is its return type; "'''-'''" before the type means the type is contravariant while "'''+'''" means covariant.
The subtyping Immutable objectmutable references is similar to the treatment function arguments and return values. Write-only references orsinksare contravariant like function arguments;

read-only references orsources are covariant, like return values. Mutable references which act as both sources and sinks are invariant.
In languages that allow side effects, like most object-oriented languages, subtyping is generally not sufficient to guarantee that a function can be safely used in the context of another. Liskov's work in this area focused on [[behavioral subtyping]], which besides the type system safety discussed in this article also requires that subtypes preserve all [[Invariant (computer science)|invariants]] guaranteed by the supertypes in some [[Design by Contract|contract]].<ref name="LSP">Barbara Liskov, Jeannette Wing, ''[http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.39.1223 A behavioral notion of subtyping]'', ACM Transactions on Programming Languages and Systems, Volume 16, Issue 6 (November 1994), pp. 1811–1841. An updated version appeared as CMU technical report: {{cite web|url=http://reports-archive.adm.cs.cmu.edu/anon/1999/CMU-CS-99-156.ps|title=Behavioral Subtyping Using Invariants and Constraints|last=Liskov|first=Barbara|authorlink=Barbara Liskov|author2=Wing, Jeannette |authorlink2=Jeannette Wing |date=July 1999|format=[[PostScript|PS]]|accessdate=2006-10-05}}</ref> This definition of subtyping is generally [[undecidable problem|undecidable]], so it cannot be verified by a [[type checker]].
Relationship with inheritance

Subtyping and inheritance are independent orthogonalrelationships. They may coincide, but none is a special case the other. In other words, between two types ''S'' and ''T'', all combinations of subtyping and inheritance are possible:
The subtyping of [[Immutable object|mutable reference]]s is similar to the treatment of function arguments and return values. Write-only references (or ''sinks'') are contravariant, like function arguments; read-only references (or ''sources'') are covariant, like return values. Mutable references which act as both sources and sinks are invariant.
# ''S'' is neither a subtype nor a derived type T''

# ''S'' is a subtype but is not a derived type'T''
==Relationship with inheritance==
# ''S'' is not a subtype but is a derived type T''

# ''S'' is both a subtype and a derived type T''
Subtyping and inheritance are independent (orthogonal) relationships. They may coincide, but none is a special case of the other. In other words, between two types ''S'' and ''T'', all combinations of subtyping and inheritance are possible:
# ''S'' is neither a subtype nor a derived type of ''T''
# ''S'' is a subtype but is not a derived type of ''T''
# ''S'' is not a subtype but is a derived type of ''T''
# ''S'' is both a subtype and a derived type of ''T''
The first case is illustrated by independent types, such as <code>Boolean</code> and <code>Float</code>.
The first case is illustrated by independent types, such as <code>Boolean</code> and <code>Float</code>.

The second case can be illustrated by the relationship between <code>Int32<code> and <code>Int64<code>. In most object oriented programming languages, <code>Int64<code> are unrelated by inheritance to <code>Int32<code>. However <code>Int32<code> can be considered a subtype<code>Int64<code> since any 32 bit integer value can be promoted into a 64 bit integer value.
The second case can be illustrated by the relationship between <code>Int32</code> and <code>Int64</code>. In most object oriented programming languages, <code>Int64</code> are unrelated by inheritance to <code>Int32</code>. However <code>Int32</code> can be considered a subtype of <code>Int64</code> since any 32 bit integer value can be promoted into a 64 bit integer value.
The third case is a consequence oSubtyping_functions function subtyping input contravariance. Assume a super class type having a method returning an object the same type ie the type m isT T also note that the first argument m is this/self) and a derived class type from By inheritance, . By bottom-up application the function subtyping rule, this means: which is only possible if ''S and T are the same. Since inheritance is an irreflexive relation Scan't be a subtype T.

Subtyping and inheritance are compatible when all inherited fields and methods the derived type have types which are subtypes the corresponding fields and methods from the inherited typesfnCookHillCanning1990Coercion
The third case is a consequence of [[Subtyping_of_functions|function subtyping input contravariance]]. Assume a super class of type ''T'' having a method ''m'' returning an object of the same type (''i.e.'' the type of ''m'' is ''T → T'', also note that the first argument of ''m'' is this/self) and a derived class type ''S'' from ''T''. By inheritance, the type of ''m'' in ''S'' is ''S → S''. In order for ''S'' to be a subtype of ''T'' the type of ''m'' in ''S'' must be a subtype of the type of ''m'' in ''T'', in other words: ''S → S ≤: T → T''. By bottom-up application of the function subtyping rule, this means: ''S ≤: T'' and ''T ≤: S'', which is only possible if ''S'' and ''T'' are the same. Since inheritance is an irreflexive relation, ''S'' can't be a subtype of ''T''.
In coercive subtyping systems, subtypes are defined by implicit type conversion functions from subtype to supertype. For each subtyping relationship S<T a coercion function coerce S → T is provided, and any object s type Sis regarded as the objectcoerce<sub>S→ T<sub>(s typeTA coercion function may be defined by composition: if U then s may be regarded as an object type u under the compound coercion coerce<sub>T→ U<sub>∘coerce<sub>S →T<sub>The type coercion from a type to itselfcoerce<sub>TT<sub> is the identity functionid<sub>T<sub>

Coercion functions for records and disjoint unionsubtypes may be defined componentwise;in the case width-extended records type coercion simply discards any components which are not defined in the supertype. The type coercion for function types may be given byfs =coerce<sub>S<sub>2<sub>T<sub>2<sub></sub>fcoerce<sub>T<sub>1</sub>S<sub>1<sub><subtreflecting theCovariance and contravariance computer science contravariance function arguments and covariance return values.
Subtyping and inheritance are compatible when all inherited fields and methods of the derived type have types which are subtypes of the corresponding fields and methods from the inherited type {{sfn|Cook|Hill|Canning|1990}}.
The coercion function is uniquely determined given the subtype and supertype.Thus when multiple subtyping relationships are defined, one must be careful to guarantee that all type coercions are coherent. For instance if an integer such as 2 int can be coerced to a floating point number say, 2.0 floatthen it is not admissible to coerce 21floatto 2

int because the compound coercioncoerce<sub<floatfloat<sub>given by
==Coercions==
coerce<sub>int float<
In coercive subtyping systems, subtypes are defined by implicit [[type conversion]] functions from subtype to supertype. For each subtyping relationship (''S'' <: ''T''), a coercion function ''coerce'': ''S'' → ''T'' is provided, and any object ''s'' of type ''S'' is regarded as the object ''coerce''<sub>''S'' → ''T''</sub>(''s'') of type ''T''. A coercion function may be defined by composition: if ''S'' <: ''T'' and ''T'' <: ''U'' then ''s'' may be regarded as an object of type ''u'' under the compound coercion (''coerce''<sub>''T'' → ''U''</sub> ∘ ''coerce''<sub>''S'' → ''T''</sub>). The [[type coercion]] from a type to itself ''coerce''<sub>''T'' → ''T''</sub> is the [[identity function]] ''id''<sub>T</sub>
sub>coerce<sub>float int<sub> would then be distinct from the identity coercionid<sub>float<sub>.

See also wikibooksAda ProgrammingType SystemSubtypeCovariance and contravariance computer scienceCovariance and contravarianceThecircle-ellipse problemfor the perils subtyping variable-types on the same basis as value-typesClass-based programming Top typeRefinement typeBehavioral subtyping
Coercion functions for records and [[disjoint union]] subtypes may be defined componentwise; in the case of width-extended records, type coercion simply discards any components which are not defined in the supertype. The type coercion for function types may be given by ''f'''(''s'') = ''coerce''<sub>''S''<sub>2</sub> → ''T''<sub>2</sub></sub>(''f''(''coerce''<sub>''T''<sub>1</sub> → ''S''<sub>1</sub></sub>(''t''))), reflecting the [[Covariance and contravariance (computer science)|contravariance]] of function arguments and covariance of return values.
=References Textbooks refbegin Benjamin C. PierceTypes and programming languages MIT Press 2002 ISBN 0-262-16209-1 chapter 15 subtyping record types 19.3 nominal vs structural types and subtyping, and 23.2 varieties polymorphism C. Szyperski D.

Gruntz S Murer Component software: beyond object-oriented programming 2nd ed Pearson Education 2002 ISBN 0-201-74572-0 pp.&nbsp;93–95 a high-level presentation aimed at programming language users refend Papers refbegin Cardelli Luca A semantics multiple inheritance.In G.Kahn, D. MacQueen, and G.Plotkin editors SemanticsData Types volume 173 Lecture Notes in Computer Science, pages 51–67. Springer-Verlag 1984. Full version in Information and Computation7623:138164 1988.Cite conference last1 = Cook first1 William R.last Hillfirst2 Walterlast3 Canning first3PeterSdoi = 1011459670996721title = Inheritance is not subtyping conference = Proc.17th ACM SIGPLAN-SIGACT Symp. on Principles Programming Languages POPL pages125–135 year1990isbn0-89791-343-4 ref=harvciteseerx = 10111028635ReynoldsJohn C Using category theory to design implicit conversions and generic operators. In N. D.Jones editor Proceedings the Aarhus Workshop Semantics-Directed Compiler Generation,number 94 in Lecture Notes in Computer Science.Springer-Verlag, January 1980.Also in Carl A. Gunter and John C. Mitchell, editors,Theoretical Aspects Object-Oriented Programming:Types,Semantics, and Language Design MIT Press 1994refenddata typesDEFAULTSORT:Subtype PolymorphismCategory:Data typesCategory:Polymorphismcomputer scienceCategory:Type theoryCategory:Object-oriented programming
The coercion function is uniquely determined given the subtype and [[supertype]]. Thus, when multiple subtyping relationships are defined, one must be careful to guarantee that all type coercions are coherent. For instance, if an integer such as 2 : ''int'' can be coerced to a floating point number (say, 2.0 : ''float''), then it is not admissible to coerce 2.1 : ''float'' to 2 : ''int'', because the compound coercion ''coerce''<sub>''float'' → ''float''</sub> given by ''coerce''<sub>''int'' → ''float''</sub> ∘ ''coerce''<sub>''float'' → ''int''</sub> would then be distinct from the identity coercion ''id''<sub>''float''</sub>.

== See also ==
{{wikibooks|Ada Programming|Type System|Subtypes}}
* [[Covariance and contravariance (computer science)|Covariance and contravariance]]
* The [[circle-ellipse problem]] (for the perils of subtyping variable-types on the same basis as value-types)
* [[Class-based programming]]
* [[Top type]]
* [[Refinement type]]
* [[Behavioral subtyping]]

==Notes==
{{reflist}}

== References ==
'''Textbooks'''
{{refbegin}}
* Benjamin C. Pierce, ''Types and programming languages'', MIT Press, 2002, {{ISBN|0-262-16209-1}}, chapter 15 (subtyping of record types), 19.3 (nominal vs. structural types and subtyping), and 23.2 (varieties of polymorphism)
* C. Szyperski, D. Gruntz, S. Murer, ''Component software: beyond object-oriented programming'', 2nd ed., Pearson Education, 2002, {{ISBN|0-201-74572-0}}, pp.&nbsp;93–95 (a high-level presentation aimed at programming language users)
{{refend}}
'''Papers'''
{{refbegin}}
* Cardelli, Luca. A semantics of multiple inheritance. In G. Kahn, D. MacQueen, and G. Plotkin, editors, Semantics of Data Types, volume 173 of Lecture Notes in Computer Science, pages 51–67. Springer-Verlag, 1984. Full version in Information and Computation, 76(2/3):138–164, 1988.
* {{Cite conference| last1 = Cook | first1 = William R. | last2 = Hill | first2 = Walter | last3 = Canning | first3 = Peter S. | doi = 10.1145/96709.96721 | title = Inheritance is not subtyping | conference = Proc. 17th ACM SIGPLAN-SIGACT Symp. on Principles of Programming Languages (POPL)| pages = 125–135 | year = 1990 | isbn = 0-89791-343-4 | ref = harv| citeseerx = 10.1.1.102.8635}}
* Reynolds, John C. Using category theory to design implicit conversions and generic operators. In N. D. Jones, editor, Proceedings of the Aarhus Workshop on Semantics-Directed Compiler Generation, number 94 in Lecture Notes in Computer Science. Springer-Verlag, January 1980. Also in Carl A. Gunter and John C. Mitchell, editors, Theoretical Aspects of Object-Oriented Programming: Types, Semantics, and Language Design (MIT Press, 1994).
{{refend}}

== Further reading ==
{{refbegin}}
* [[John C. Reynolds]], ''Theories of programming languages'', Cambridge University Press, 1998, {{ISBN|0-521-59414-6}}, chapter 16.
* [[Martín Abadi]], [[Luca Cardelli]], ''A theory of objects'', Springer, 1996, {{ISBN|0-387-94775-2}}. Section 8.6 contrast the subtyping of records and objects.
{{refend}}

{{data types}}

{{DEFAULTSORT:Subtype Polymorphism}}
[[Category:Data types]]
[[Category:Polymorphism (computer science)]]
[[Category:Type theory]]
[[Category:Object-oriented programming]]

Revision as of 22:01, 27 June 2020

In programming language theory, subtyping (also subtype polymorphism or inclusion polymorphism) is a form of type polymorphism in which a subtype is a datatype that is related to another datatype (the supertype) by some notion of substitutability, meaning that program elements, typically subroutines or functions, written to operate on elements of the supertype can also operate on elements of the subtype. If S is a subtype of T, the subtyping relation is often written S <: T, to mean that any term of type S can be safely used in a context where a term of type T is expected. The precise semantics of subtyping crucially depends on the particulars of what "safely used in a context where" means in a given programming language. The type system of a programming language essentially defines its own subtyping relation, which may well be trivial should the language support no (or very little) conversion mechanisms.

Due to the subtyping relation, a term may belong to more than one type. Subtyping is therefore a form of type polymorphism. In object-oriented programming the term 'polymorphism' is commonly used to refer solely to this subtype polymorphism, while the techniques of parametric polymorphism would be considered generic programming.

Functional programming languages often allow the subtyping of records. Consequently, simply typed lambda calculus extended with record types is perhaps the simplest theoretical setting in which a useful notion of subtyping may be defined and studied [citation needed]. Because the resulting calculus allows terms to have more than one type, it is no longer a "simple" type theory. Since functional programming languages, by definition, support function literals, which can also be stored in records, records types with subtyping provide some of the features of object-oriented programming. Typically, functional programming languages also provide some, usually restricted, form of parametric polymorphism. In a theoretical setting, it is desirable to study the interaction of the two features; a common theoretical setting is system F<:. Various calculi that attempt to capture the theoretical properties of object-oriented programming may be derived from system F<:.

The concept of subtyping is related to the linguistic notions of hyponymy and holonymy. It is also related to the concept of bounded quantification in mathematical logic. Subtyping should not be confused with the notion of (class or object) inheritance from object-oriented languages;[1] subtyping is a relation between types (interfaces in object-oriented parlance) whereas inheritance is a relation between implementations stemming from a language feature that allows new objects to be created from existing ones. In a number of object-oriented languages, subtyping is called interface inheritance, with inheritance referred to as implementation inheritance.

Origins

The notion of subtyping in programming languages dates back to the 1960s; it was introduced in Simula derivatives. The first formal treatments of subtyping were given by John C. Reynolds in 1980 who used category theory to formalize implicit conversions, and Luca Cardelli (1985).[2]

The concept of subtyping has gained visibility (and synonymy with polymorphism in some circles) with the mainstream adoption of object-oriented programming. In this context, the principle of safe substitution is often called the Liskov substitution principle, after Barbara Liskov who popularized it in a keynote address at a conference on object-oriented programming in 1987. Because it must consider mutable objects, the ideal notion of subtyping defined by Liskov and Jeannette Wing, called behavioral subtyping is considerably stronger than what can be implemented in a type checker. (See Function types below for details.)

Examples

Example of subtypes: where bird is the supertype and all others are subtypes as denoted by the arrow in UML notation

A simple practical example of subtypes is shown in the diagram, right. The type "bird" has three subtypes "duck", "cuckoo" and "ostrich". Conceptually, each of these is a variety of the basic type "bird" that inherits many "bird" characteristics but has some specific differences. The UML notation is used in this diagram, with open-headed arrows showing the direction and type of the relationship between the supertype and its subtypes.

As a more practical example, a language might allow integer values to be used wherever floating point values are expected (Integer <: Float), or it might define a generic type Number as a common supertype of integers and the reals. In this second case, we only have Integer <: Number and Float <: Number, but Integer and Float are not subtypes of each other.

Programmers may take advantage of subtyping to write code in a more abstract manner than would be possible without it. Consider the following example:

function max (x as Number, y as Number) is
  if x < y then
    return y
  else
    return x
end

If integer and real are both subtypes of Number, and an operator of comparison with an arbitrary Number is defined for both types, then values of either type can be passed to this function. However, the very possibility of implementing such an operator highly constrains the Number type (for example, one can't compare an integer with a complex number), and actually only comparing integers with integers and reals with reals makes sense. Rewriting this function so that it would only accept 'x' and 'y' of the same type requires bounded polymorphism.

Subsumption

In type theory the concept of subsumption[3] is used to define or evaluate whether a type S is a subtype of type T.

A type is a set of values. The set can be described extensionally by listing all the values, or it can be described intensionally by stating the membership of the set by a predicate over a domain of possible values. In common programming languages enumeration types are defined extensionally by listing values. User-defined types like records (structs, interfaces) or classes are defined intensionally by an explicit type declaration or by using an existing value, which encodes type information, as a prototype to be copied or extended.

In discussing the concept of subsumption, the set of values of a type is indicated by writing its name in mathematical italics: T. The type, viewed as a predicate over a domain, is indicated by writing its name in bold: T. The conventional symbol <: means "is a subtype of", and :> means "is a supertype of".

  • A type T subsumes S if the set of values T which it defines, is a superset of the set S, so that every member of S is also a member of T.
  • A type may be subsumed by more than one type: the supertypes of S intersect at S.
  • If S <: T (and therefore ST ), then T, the predicate which circumscribes the set T, must be part of the predicate S (over the same domain) which defines S.
  • If S subsumes T, and T subsumes S, then the two types are equal (although they may not be the same type if the type system distinguishes types by name).

A rule of thumb follow: a subtype is likely to be "bigger/wider/deeper" (its values hold more information) and "fewer/smaller" (the set of values is smaller) than one of its supertypes (which has more restricted information, and whose set of values are a superset of those of the subtype).

In the context of subsumption type definitions can be expressed using Set-builder notation, which uses a predicate to define a set. Predicates can be defined, over a domain (set of possible values) D. Predicates are partial functions that compare values to selection criteria. For example: "is an integer value greater than or equal to 100 and less than 200?". If a value matches the criteria then the function returns the value. If not, the value is not selected, and nothing is returned. (List comprehensions are a form of this pattern used in many programming languages.)

If there are two predicates, which applies selection criteria for the type T, and which applies additional criteria for the type S, then sets for the two types can be defined:

The predicate T = is applied alongside as part of the compound predicate S defining S. The two predicates are conjoined, so both must be true for a value to be selected. The predicate S = T = subsumes the predicate T, so S <: (subtypes) T.

For example: there is a subfamily of cat species called Felinae, which is part of the family Felidae. The genus Felis, to which the domestic cat species Felis catus belongs, is part of that subfamily.

The conjunction of predicates has been expressed here through application of the second predicate over the domain of values conforming to the first predicate. Viewed as types, Felis <: Felinae <: Felidae.

If T subsumes S (T :> S) then a procedure, function or expression given a value as an operand (input argument or term) will therefore be able to operate over that value as one of type T, because . In the example above, we could expect the function ofSubfamily to be applicable to values of all three types Felidae, Felinae and Felis.

Subtyping schemes

Type theorists make a distinction between nominal subtyping, in which only types declared in a certain way may be subtypes of each other, and structural subtyping, in which the structure of two types determines whether or not one is a subtype of the other. The class-based object-oriented subtyping described above is nominal; a structural subtyping rule for an object-oriented language might say that if objects of type A can handle all of the messages that objects of type B can handle (that is, if they define all the same methods), then A is a subtype of B regardless of whether either inherits from the other. This so-called duck typing is common in dynamically typed object-oriented languages. Sound structural subtyping rules for types other than object types are also well known.[citation needed]

Implementations of programming languages with subtyping fall into two general classes: inclusive implementations, in which the representation of any value of type A also represents the same value at type B if A<:B, and coercive implementations, in which a value of type A can be automatically converted into one of type B. The subtyping induced by subclassing in an object-oriented language is usually inclusive; subtyping relations that relate integers and floating-point numbers, which are represented differently, are usually coercive.

In almost all type systems that define a subtyping relation, it is reflexive (meaning A<:A for any type A) and transitive (meaning that if A<:B and B<:C then A<:C). This makes it a preorder on types.

Record types

Width and depth subtyping

Types of records give rise to the concepts of width and depth subtyping. These express two different ways of obtaining a new type of record that allows the same operations as the original record type.

Recall that a record is a collection of (named) fields. Since a subtype is a type which allows all operations allowed on the original type, a record subtype should support the same operations on the fields as the original type supported.

One kind of way to achieve such support, called width subtyping, adds more fields to the record. More formally, every (named) field appearing in the width supertype will appear in the width subtype. Thus, any operation feasible on the supertype will be supported by the subtype.

The second method, called depth subtyping, replaces the various fields with their subtypes. That is, the fields of the subtype are subtypes of the fields of the supertype. Since any operation supported for a field in the supertype is supported for its subtype, any operation feasible on the record supertype is supported by the record subtype. Depth subtyping only makes sense for immutable records: for example, you can assign 1.5 to the 'x' field of a real point (a record with two real fields), but you can't do the same to the 'x' field of an integer point (which, however, is a deep subtype of the real point type) because 1.5 is not an integer (see Variance).

Subtyping of records can be defined in System F<:, which combines parametric polymorphism with subtyping of record types and is a theoretical basis for many functional programming languages that support both features.

Some systems also support subtyping of labeled disjoint union types (such as algebraic data types). The rule for width subtyping is reversed: every tag appearing in the width subtype must appear in the width supertype.

Function types

If T1T2 is a function type, then a subtype of it is any function type S1S2 with the property that T1 <: S1 and S2 <: T2. This can be summarised using the following typing rule:

The argument type of S1S2 is said to be contravariant because the subtyping relation is reversed for it, whereas the return type is covariant. Informally, this reversal occurs because the refined type is "more liberal" in the types it accepts and "more conservative" in the type it returns. This is what exactly works in Scala: a n-ary function is internally a class that inherits the FunctionN(-A1, -A2, …, -An, +B) trait (which can be seen as a general interface in Java-like languages), where A1, A2, … An are the parameter types, and B is its return type; "-" before the type means the type is contravariant while "+" means covariant.

In languages that allow side effects, like most object-oriented languages, subtyping is generally not sufficient to guarantee that a function can be safely used in the context of another. Liskov's work in this area focused on behavioral subtyping, which besides the type system safety discussed in this article also requires that subtypes preserve all invariants guaranteed by the supertypes in some contract.[4] This definition of subtyping is generally undecidable, so it cannot be verified by a type checker.

The subtyping of mutable references is similar to the treatment of function arguments and return values. Write-only references (or sinks) are contravariant, like function arguments; read-only references (or sources) are covariant, like return values. Mutable references which act as both sources and sinks are invariant.

Relationship with inheritance

Subtyping and inheritance are independent (orthogonal) relationships. They may coincide, but none is a special case of the other. In other words, between two types S and T, all combinations of subtyping and inheritance are possible:

  1. S is neither a subtype nor a derived type of T
  2. S is a subtype but is not a derived type of T
  3. S is not a subtype but is a derived type of T
  4. S is both a subtype and a derived type of T

The first case is illustrated by independent types, such as Boolean and Float.

The second case can be illustrated by the relationship between Int32 and Int64. In most object oriented programming languages, Int64 are unrelated by inheritance to Int32. However Int32 can be considered a subtype of Int64 since any 32 bit integer value can be promoted into a 64 bit integer value.

The third case is a consequence of function subtyping input contravariance. Assume a super class of type T having a method m returning an object of the same type (i.e. the type of m is T → T, also note that the first argument of m is this/self) and a derived class type S from T. By inheritance, the type of m in S is S → S. In order for S to be a subtype of T the type of m in S must be a subtype of the type of m in T, in other words: S → S ≤: T → T. By bottom-up application of the function subtyping rule, this means: S ≤: T and T ≤: S, which is only possible if S and T are the same. Since inheritance is an irreflexive relation, S can't be a subtype of T.

Subtyping and inheritance are compatible when all inherited fields and methods of the derived type have types which are subtypes of the corresponding fields and methods from the inherited type [1].

Coercions

In coercive subtyping systems, subtypes are defined by implicit type conversion functions from subtype to supertype. For each subtyping relationship (S <: T), a coercion function coerce: ST is provided, and any object s of type S is regarded as the object coerceST(s) of type T. A coercion function may be defined by composition: if S <: T and T <: U then s may be regarded as an object of type u under the compound coercion (coerceTUcoerceST). The type coercion from a type to itself coerceTT is the identity function idT

Coercion functions for records and disjoint union subtypes may be defined componentwise; in the case of width-extended records, type coercion simply discards any components which are not defined in the supertype. The type coercion for function types may be given by f'(s) = coerceS2T2(f(coerceT1S1(t))), reflecting the contravariance of function arguments and covariance of return values.

The coercion function is uniquely determined given the subtype and supertype. Thus, when multiple subtyping relationships are defined, one must be careful to guarantee that all type coercions are coherent. For instance, if an integer such as 2 : int can be coerced to a floating point number (say, 2.0 : float), then it is not admissible to coerce 2.1 : float to 2 : int, because the compound coercion coercefloatfloat given by coerceintfloatcoercefloatint would then be distinct from the identity coercion idfloat.

See also

Notes

  1. ^ a b Cook, Hill & Canning 1990.
  2. ^ Pierce, ch. 15 notes
  3. ^ Benjamin C. Pierce, Types and Programming Languages, MIT Press, 2002, 15.1 "Subsumption", p. 181-182
  4. ^ Barbara Liskov, Jeannette Wing, A behavioral notion of subtyping, ACM Transactions on Programming Languages and Systems, Volume 16, Issue 6 (November 1994), pp. 1811–1841. An updated version appeared as CMU technical report: Liskov, Barbara; Wing, Jeannette (July 1999). "Behavioral Subtyping Using Invariants and Constraints" (PS). Retrieved 2006-10-05.

References

Textbooks

  • Benjamin C. Pierce, Types and programming languages, MIT Press, 2002, ISBN 0-262-16209-1, chapter 15 (subtyping of record types), 19.3 (nominal vs. structural types and subtyping), and 23.2 (varieties of polymorphism)
  • C. Szyperski, D. Gruntz, S. Murer, Component software: beyond object-oriented programming, 2nd ed., Pearson Education, 2002, ISBN 0-201-74572-0, pp. 93–95 (a high-level presentation aimed at programming language users)

Papers

  • Cardelli, Luca. A semantics of multiple inheritance. In G. Kahn, D. MacQueen, and G. Plotkin, editors, Semantics of Data Types, volume 173 of Lecture Notes in Computer Science, pages 51–67. Springer-Verlag, 1984. Full version in Information and Computation, 76(2/3):138–164, 1988.
  • Cook, William R.; Hill, Walter; Canning, Peter S. (1990). Inheritance is not subtyping. Proc. 17th ACM SIGPLAN-SIGACT Symp. on Principles of Programming Languages (POPL). pp. 125–135. CiteSeerX 10.1.1.102.8635. doi:10.1145/96709.96721. ISBN 0-89791-343-4. {{cite conference}}: Invalid |ref=harv (help)
  • Reynolds, John C. Using category theory to design implicit conversions and generic operators. In N. D. Jones, editor, Proceedings of the Aarhus Workshop on Semantics-Directed Compiler Generation, number 94 in Lecture Notes in Computer Science. Springer-Verlag, January 1980. Also in Carl A. Gunter and John C. Mitchell, editors, Theoretical Aspects of Object-Oriented Programming: Types, Semantics, and Language Design (MIT Press, 1994).

Further reading