Type conversion: Difference between revisions

m Reverted edits by 185.159.218.180 (talk) (HG) (3.4.12)
 
(343 intermediate revisions by more than 100 users not shown)
{{Short description|Changing an expression from one data type to another}}
{{cleanup-date|January 2005}}
{{for|the aviation licensing process|Type conversion (aviation)}}
{{for multi|metal casting|Casting (metalworking)#Upcasting}}
{{multiple issues|
{{More citations needed|date=May 2011}}
{{original research|date=May 2018}}
}}
{{Use dmy dates|date=August 2021}}


In [[computer science]], '''type conversion''',<ref name=":0">{{Cite book|title=S. Chand's Computer Science|year=2008|isbn=978-8121929844|pages=81–83|last1=Mehrotra|first1=Dheeraj|publisher=S. Chand }}</ref><ref>{{Cite book|title=Programming Languages - Design and Constructs|year=2013|isbn=978-9381159415|pages=35|publisher=Laxmi Publications }}</ref> '''type casting''',<ref name=":0" /><ref name=":1">{{Cite book|title=Concise Encyclopedia of Computer Science|last=Reilly|first=Edwin|year=2004|isbn=0470090952|pages=[https://archive.org/details/conciseencyclope0000unse_v5u2/page/82 82, 110]|publisher=John Wiley & Sons |url=https://archive.org/details/conciseencyclope0000unse_v5u2/page/82}}</ref> '''type coercion''',<ref name=":1" /> and '''type juggling'''<ref>{{Cite book|title=Pro TypeScript: Application-Scale JavaScript Development|last=Fenton|first=Steve|year=2017|isbn=978-1484232491|pages=xxiii|publisher=Apress }}</ref><ref>{{Cite web|url=http://php.net/manual/en/language.types.type-juggling.php|title=PHP: Type Juggling - Manual|website=php.net|access-date=2019-01-27}}</ref> are different ways of changing an [[Expression (computer science)|expression]] from one [[data type]] to another. An example would be the conversion of an [[integer (computer science)|integer]] value into a [[floating point]] value or its textual representation as a [[string (computer science)|string]], and vice versa. Type conversions can take advantage of certain features of [[type hierarchy|type hierarchies]] or [[data representation]]s. 
Two important aspects of a type conversion are whether it happens ''implicitly'' (automatically) or ''explicitly'',<ref name=":0" /><ref name=":2">{{Cite book|title=C++ Quick Syntax Reference|last=Olsson|first=Mikael|year=2013|isbn=978-1430262770|pages=87–89|publisher=Apress }}</ref> and whether the underlying data representation is converted from one representation into another, or a given representation is merely ''reinterpreted'' as the representation of another data type.<ref name=":2"/><ref>{{Cite book|title=Computational Intelligence: A Methodological Introduction|isbn=978-1447172963|pages=269|last1=Kruse|first1=Rudolf|last2=Borgelt|first2=Christian|last3=Braune|first3=Christian|last4=Mostaghim|first4=Sanaz|last5=Steinbrecher|first5=Matthias|date=16 September 2016|publisher=Springer }}</ref> In general, both [[primitive data type|primitive]] and [[compound data type]]s can be converted.
{{mergefrom|typecast (programming)}}
{{Move to Wikibooks}}
[[wikibooks:Programming:C -/- -/-#Casting (Type)]] (the parts which are not a general description but a tutorial on C++.)


Each [[programming language]] has its own rules on how types can be converted. Languages with [[strong typing]] typically do little implicit conversion and discourage the reinterpretation of representations, while languages with [[weak typing]] perform many implicit conversions between data types. Weakly typed languages often allow forcing the [[compiler]] to arbitrarily interpret a data item as having different representations; this can be a non-obvious programming error, or a technical method to directly deal with underlying hardware.
= General =


In most languages, the word ''coercion'' is used to denote an ''implicit'' conversion, either during compilation or during [[run time (program lifecycle phase)|run time]]. For example, in an expression mixing integer and floating point numbers (like 5 + 0.1), the compiler will automatically convert integer representation into floating point representation so fractions are not lost. Explicit type conversions are either indicated by writing additional code (e.g. adding type identifiers or calling built-in [[Subroutine|routines]]) or by coding conversion routines for the compiler to use when it otherwise would halt with a type mismatch.
In [[computer science]], '''type conversion''' or '''typecasting''' refers to changing an entity of one [[datatype]] into another. There are two types of conversion: implicit and explicit. The terminology for implicit type conversion is '''[[#Implicit type conversion|coercion]]'''. Explicit type conversion is known as '''[[#Explicit type conversion|cast]]'''.


In most [[ALGOL]]-like languages, such as [[Pascal (programming language)|Pascal]], [[Modula-2]], [[Ada (programming language)|Ada]] and [[Delphi (programming language)|Delphi]], ''conversion'' and ''casting'' are distinctly different concepts. In these languages, ''conversion'' refers to either implicitly or explicitly changing a value from one data type storage format to another, e.g. a 16-bit integer to a 32-bit integer. The storage needs may change as a result of the conversion, including a possible loss of precision or truncation. The word ''cast'', on the other hand, refers to explicitly changing the ''interpretation'' of the ''bit pattern'' representing a value from one type to another. For example, 32 contiguous bits may be treated as an array of 32 booleans, a 4-byte string, an unsigned 32-bit integer or an IEEE single precision floating point value. Because the stored bits are never changed, the programmer must know low level details such as representation format, byte order, and alignment needs, to meaningfully cast.
= Implicit type conversion =
Implicit type conversion, also known as '''coercion''', is an automatic type conversion by the [[compiler]]. Some [[programming language|language]]s allow, or even require compilers to provide coercion.


In the C family of languages and [[ALGOL 68]], the word ''cast'' typically refers to an ''explicit'' type conversion (as opposed to an implicit conversion), causing some ambiguity about whether this is a re-interpretation of a bit-pattern or a real data representation conversion. More important is the multitude of ways and rules that apply to what data type (or class) is located by a pointer and how a pointer may be adjusted by the compiler in cases like object (class) inheritance.
In a mixed type expression, a [[Subtype and derived type|subtype]] ''s'' will be converted to a [[supertype]] ''t'' or some subtypes ''s<sub>1</sub>'', ''s<sub>2</sub>'', ... will be converted to a supertype ''t'' (maybe none of the ''s<sub>i</sub>'' is of type ''t'') at [[runtime]] so that the program will run correctly. For example:


== Explicit casting in various languages ==
double d;
long l;
int i;<br>
if (d > i) d = i;
if (i > l) l = i;
if (d == l) d *= 2;


=== Ada ===
is legal in a [[C programming language|C language]] program. Although ''d'', ''l'' and ''i'' belong to different datatypes, they will be automatically converted to the same datatype each time a comparison or assignment is executed.
[[Ada (programming language)|Ada]] provides a generic library function Unchecked_Conversion.<ref>{{Cite web |title=Unchecked Type Conversions |url=https://www.adaic.org/resources/add_content/standards/95lrm/ARM_HTML/RM-13-9.html |access-date=2023-03-11 |website=Ada Information Clearinghouse}}</ref>


===C-like languages===
= Explicit type conversion =
There are several types of conversion.


====Implicit type conversion====
; checked : before the conversion is performed, a runtime check is done to see if the destination type can actually hold the source value. If not, an error condition is raised.
Implicit type conversion, also known as ''coercion'' or ''type juggling'', is an automatic type conversion by the [[compiler]]. Some [[programming language]]s allow compilers to provide coercion; others require it.
; unchecked : no check is performed, and when the destination type cannot hold the source value the result is undefined.
; bit pattern : The data is not interpreted at all and just the raw bit pattern is copied.


In a mixed-type expression, data of one or more [[Subtyping|subtype]]s can be [[Operators in C and C++|converted]] to a supertype as needed at [[Run time (program lifecycle phase)|runtime]] so that the program will run correctly. For example, the following is legal [[C (programming language)|C language]] code:
Each [[programming language]] has its own rules on how types can be converted. In general, both objects and fundamental data types can be ''converted''.


<syntaxhighlight lang="c">
== in Ada ==
double d;
{{Wikibooks Chapter|Ada Programming|Subtypes|Converting Data}}
long l;
Ada supports all three conversion techniques and a few related techniques. On [[Wikibooks]] there is a full article on how to use them. It might be worth reading even if you plan to use another programming language.
int i;


== in C/C++ ==
if (d > i) d = i;
if (i > l) l = i;
if (d == l) d *= 2;
</syntaxhighlight>


Although {{mono|'''d'''}}, {{mono|'''l'''}}, and {{mono|'''i'''}} belong to different data types, they will be automatically converted to a common data type each time a comparison or assignment is executed. This behavior should be used with caution, as [[unintended consequences]] can arise. Data can be lost when converting representations from floating-point to integer, as the fractional components of the floating-point values will be truncated (rounded toward zero). Conversely, precision can be lost when converting representations from integer to floating-point, since a floating-point type may be unable to exactly represent all possible values of some integer type. For example, {{C-lang|float}} might be an [[IEEE 754]] single precision type, which cannot represent the integer 16777217 exactly, while a 32-bit integer type can. This can lead to unintuitive behavior, as demonstrated by the following code:
A '''cast''', or ''explicit type conversion'', is a special programming instruction which specifies which [[data type]] a [[Variable#Computer_programming|variable]] (or an intermediate calculation result) is to be treated as in a given expression.


<!-- an int is within - 32767(?) and 32767, isn't it ?! -->
Casting will ignore "extra" information (but never adds information to the type being cast). The C/C++ '''cast''' is either "unchecked" or "bit pattern".
<syntaxhighlight lang="c">
#include <stdio.h>


int main(void)
As an example with fundamental data types, a [[fixed-point arithmetic|fixed-point float]] could be cast as an [[integer]], where the data beyond the decimal (or binary) point is ignored. Alternatively, an integer could be cast as a float if, for example, a function call required a floating point type (but, as noted, no information is really added - 1 would become 1.0000000).
{
int i_value = 16777217;
float f_value = 16777216.0;
printf("The integer is: %d\n", i_value);
printf("The float is: %f\n", f_value);
printf("Their equality: %d\n", i_value == f_value);
}
</syntaxhighlight>


On compilers that implement floats as IEEE single precision, and ints as at least 32 bits, this code will give this peculiar print-out:
Object casting works in a similar way. A [[subclass]] can be cast as a parent type, where the "extra" information that makes it a subclass is ignored, and only the parts inherited from the parent are treated. For example, a triangle class derived from a shape class could be cast as a shape.


The integer is: 16777217
=== Two common casting styles ===
The float is: 16777216.000000
Their equality: 1


Note that 1 represents equality in the last line above. This odd behavior is caused by an implicit conversion of {{C-lang|i_value}} to float when it is compared with {{C-lang|f_value}}. The conversion causes loss of precision, which makes the values equal before the comparison.
There are two common casting styles, each outlined below.


Important takeaways:
==== C-style casting ====


# {{C-lang|float}} to {{C-lang|int}} causes [[truncation]], i.e., removal of the fractional part.
This style of casting is used in [[C programming language|C]] and [[Java programming language|Java]]. It follows the form:
# {{C-lang|double}} to {{C-lang|float}} causes rounding of digits that do not fit in the smaller type.
# {{C-lang|long}} to {{C-lang|int}} causes dropping of excess higher order bits.


=====Type promotion=====
(type)expression
One special case of implicit type conversion is type promotion, where an object is automatically converted into another data type representing a [[superset]] of the original type. Promotions are commonly used with types smaller than the native type of the target platform's [[arithmetic logic unit]] (ALU), before arithmetic and logical operations, to make such operations possible, or more efficient if the ALU can work with more than one type. C and C++ perform such promotion for objects of boolean, character, wide character, enumeration, and short integer types which are promoted to int, and for objects of type float, which are promoted to double. Unlike some other type conversions, promotions never lose precision or modify the value stored in the object.


In [[Java (programming language)|Java]]:
==== C++-style casting ====
<syntaxhighlight lang="java">
int x = 3;
double y = 3.5;
System.out.println(x + y); // The output will be 6.5
</syntaxhighlight>


====Explicit type conversion====
Several cast syntaxes are used in [[C plus plus|C++]] (although C-style casting is supported as well). The function-call style follows the form:
Explicit type conversion, also called type casting, is a type conversion which is explicitly defined within a program (instead of being done automatically according to the rules of the language for implicit type conversion). It is requested by the user in the program.


<syntaxhighlight lang="cpp">
type(expression)
double da = 3.3;
double db = 3.3;
double dc = 3.4;
int result = (int)da + (int)db + (int)dc; // result == 9
// if implicit conversion would be used (as with "result = da + db + dc"), result would be equal to 10
</syntaxhighlight>


There are several kinds of explicit conversion.
This style of casting was adopted to force clarity when using casting. For example, the result of, and intention of, the C style cast


; checked: Before the conversion is performed, a runtime check is done to see if the destination type can hold the source value. If not, an error condition is raised.
(type)firstVariable + secondVariable
; unchecked: No check is performed. If the destination type cannot hold the source value, the result is undefined.
; bit pattern: The raw bit representation of the source is copied verbatim, and it is re-interpreted according to the destination type. This can also be achieved via [[Aliasing (computing)|aliasing]].


In [[object-oriented programming]] languages, objects can also be [[downcasting|downcast]]: a reference to a base class is cast to one of its derived classes.
may not be clear, while the same cast using C++-style casting allows more clarity:


===C# and C++===
type(firstVariable + secondVariable)
In [[C Sharp (programming language)|C#]], type conversion can be made in a safe or unsafe (i.e., C-like) manner, the former called ''checked type cast''.<ref>
or
{{cite web
type(firstVariable) + secondVariable
| access-date = 4 August 2011
| date = 25 March 2002
| first = Hanspeter
| last = Mössenböck
| page = 5
| publisher = Institut für Systemsoftware, Johannes Kepler Universität Linz, Fachbereich Informatik
| title = Advanced C#: Checked Type Casts
| url = http://ssw.jku.at/Teaching/Lectures/CSharp/Tutorial/Part2.pdf}} at [http://ssw.jku.at/Teaching/Lectures/CSharp/Tutorial/ C# Tutorial]
</ref>


<syntaxhighlight lang="csharp">
Later in the evolution of C++, the following more explicit casts were added to the language to clarify the programmer's intent even further:
Animal animal = new Cat();


Bulldog b = (Bulldog) animal; // if (animal is Bulldog), stat.type(animal) is Bulldog, else an exception
static_cast<type>(value_to_cast)
b = animal as Bulldog; // if (animal is Bulldog), b = (Bulldog) animal, else b = null
dynamic_cast<type>(value_to_cast)
const_cast<type>(value_to_cast)
reinterpret_cast<type>(value_to_cast)


animal = null;
Static casts convert type-compatible values. For instance the following:
b = animal as Bulldog; // b == null
</syntaxhighlight>


In [[C++]] a similar effect can be achieved using ''C++-style cast syntax''.
double myDouble = 3.0;
int myInt = static_cast<int>(myDouble);


<syntaxhighlight lang="cpp">
converts the double-precision [[floating point]] value <code>myDouble</code> (3.0) to the corresponding integer value (3). Static casts can be dangerous:
Animal* animal = new Cat;


Bulldog* b = static_cast<Bulldog*>(animal); // compiles only if either Animal or Bulldog is derived from the other (or same)
YourClass * pYour = GimmeAnObject();
b = dynamic_cast<Bulldog*>(animal); // if (animal is Bulldog), b = (Bulldog*) animal, else b = nullptr
void * pv = pYour; // no cast needed.
MyClass * p = static_cast<MyClass *>(pYour); // MyClass had better be related to YourClass...
p->SomeMethod(); // ...or this might blow up in a nasty way.


Bulldog& br = dynamic_cast<Bulldog&>(*animal); // same as above, but an exception will be thrown if a nullptr was to be returned
Static casts on pointers or references don't verify that the pointed-to object is type-compatible to the new type.
// this is not seen in code where exception handling is avoided


delete animal; // always free resources
A dynamic cast is safer than a static cast in this scenario: it is compiled by the compiler into a call to the C++ [[runtime|runtime library]] where a check is made to ensure legal casts. This is analogous to the casts in [[Java programming language|Java]].
animal = nullptr;
b = dynamic_cast<Bulldog*>(animal); // b == nullptr
</syntaxhighlight>


===Eiffel===
YourClass * pYour = GimmeAnObject();
void * pv = pYour; // no cast needed.
MyClass * p = dynamic_cast<MyClass *>(pYour); // This won't blow up in the same way
if (p != 0)
p->SomeMethod(); // C++ guarantees p points to a MyClass


In [[Eiffel (programming language)|Eiffel]] the notion of type conversion is integrated into the rules of the type system. The Assignment Rule says that an assignment, such as:
Dynamic casts on pointers return a null pointer if the cast value is type-incompatible. Dynamic casts on a reference throw a <code>std::bad_cast</code> [[exception handling|exception]].
<syntaxhighlight lang="eiffel">
x := y
</syntaxhighlight>
is valid if and only if the type of its source expression, <code lang="eiffel">y</code> in this case, is ''compatible with'' the type of its target entity, <code lang="eiffel">x</code> in this case. In this rule, ''compatible with'' means that the type of the source expression either ''conforms to'' or ''converts to'' that of the target. Conformance of types is defined by the familiar rules for [[polymorphism in object-oriented programming]]. For example, in the assignment above, the type of <code lang="eiffel">y</code> conforms to the type of <code lang="eiffel">x</code> if the class upon which <code lang="eiffel">y</code> is based is a descendant of that upon which <code lang="eiffel">x</code> is based.


====Definition of type conversion in Eiffel====
A const cast casts away the [[const correctness|constness]] of an object, returning a non-const reference to the same object. This allows modifications to objects that normally would be treated read-only by the compiler:
The actions of type conversion in Eiffel, specifically ''converts to'' and ''converts from'' are defined as:


<blockquote>
const MyClass * cantTouchThis = CreateConstObject();
A type based on a class CU ''converts to'' a type T based on a class CT (and T ''converts from'' U) if either
cantTouchThis->constant_value = 41; // compile-time error.
:CT has a ''conversion procedure'' using U as a conversion type, or
const_cast<MyClass *>(cantTouchThis)->constant_value = 42; // compiles, but who knows what happens at runtime?
:CU has a ''conversion query'' listing T as a conversion type
</blockquote>


====Example====
The reinterpret cast is the most notorious one in C++. It allows the reinterpretation of the raw bit pattern of the value to be cast, disregarding the type system completely. For example, it allows the casting of an arbitrary integer to a pointer to an object:


Eiffel is a fully compliant [[List of CLI languages|language]] for Microsoft [[.NET Framework]]. Before development of .NET, Eiffel already had extensive class libraries. Using the .NET type libraries, particularly with commonly used types such as strings, poses a conversion problem. Existing Eiffel software uses the string classes (such as <code lang="eiffel">STRING_8</code>) from the Eiffel libraries, but Eiffel software written for .NET must use the .NET string class (<code lang="eiffel">System.String</code>) in many cases, for example when calling .NET methods which expect items of the .NET type to be passed as arguments. So, the conversion of these types back and forth needs to be as seamless as possible.
MyClass * pclass = reinterpret_cast<MyClass *>(0xDEADBEEF); // I know what I'm doing
pclass->some_field = 3.14159; // very unsafe indeed


<syntaxhighlight lang="eiffel">
Opinions were divided when these verbose casts were introduced into the language. [[Detractor]]s argued the new syntax was 'ugly', while [[supporter]]s claimed that since casting is such an 'ugly' activity to begin with, it should be highlighted with an 'ugly' syntax to alert [[programmer]]s. Another perceived advantage is the ease with which verbose casts can be located in [[source code]] using [[programming tools]] like [[grep]].
my_string: STRING_8 -- Native Eiffel string
my_system_string: SYSTEM_STRING -- Native .NET string


...

my_string := my_system_string
</syntaxhighlight>

In the code above, two strings are declared, one of each different type (<code lang="eiffel">SYSTEM_STRING</code> is the Eiffel compliant alias for System.String). Because <code lang="eiffel">System.String</code> does not conform to <code lang="eiffel">STRING_8</code>, the assignment above is valid only if <code lang="eiffel">System.String</code> converts to <code lang="eiffel">STRING_8</code>.

The Eiffel class <code lang="eiffel">STRING_8</code> has a conversion procedure <code lang="eiffel">make_from_cil</code> for objects of type <code lang="eiffel">System.String</code>. Conversion procedures are also always designated as creation procedures (similar to constructors). The following is an excerpt from the <code lang="eiffel">STRING_8</code> class:

<syntaxhighlight lang="eiffel">
class STRING_8
...
create
make_from_cil
...
convert
make_from_cil ({SYSTEM_STRING})
...
</syntaxhighlight>

The presence of the conversion procedure makes the assignment:

<syntaxhighlight lang="eiffel">
my_string := my_system_string
</syntaxhighlight>

semantically equivalent to:

<syntaxhighlight lang="eiffel">
create my_string.make_from_cil (my_system_string)
</syntaxhighlight>

in which <code lang="eiffel">my_string</code> is constructed as a new object of type <code lang="eiffel">STRING_8</code> with content equivalent to that of <code lang="eiffel">my_system_string</code>.

To handle an assignment with original source and target reversed:

<syntaxhighlight lang="eiffel">
my_system_string := my_string
</syntaxhighlight>

the class <code lang="eiffel">STRING_8</code> also contains a conversion query <code lang="eiffel">to_cil</code> which will produce a <code lang="eiffel">System.String</code> from an instance of <code lang="eiffel">STRING_8</code>.

<syntaxhighlight lang="eiffel">
class STRING_8
...
create
make_from_cil
...
convert
make_from_cil ({SYSTEM_STRING})
to_cil: {SYSTEM_STRING}
...
</syntaxhighlight>

The assignment:

<syntaxhighlight lang="eiffel">
my_system_string := my_string
</syntaxhighlight>

then, becomes equivalent to:

<syntaxhighlight lang="eiffel">
my_system_string := my_string.to_cil
</syntaxhighlight>

In Eiffel, the setup for type conversion is included in the class code, but then appears to happen as automatically as [[#Explicit type conversion|explicit type conversion]] in client code. This includes not just assignments but other types of attachments as well, such as argument (parameter) substitution.

===Rust===
[[Rust (programming language)|Rust]] provides no implicit type conversion (coercion) between primitive types. However, explicit type conversion (casting) can be performed using the <code>as</code> keyword.<ref>{{cite web |title=Casting - Rust By Example |url=https://doc.rust-lang.org/rust-by-example/types/cast.html |website=doc.rust-lang.org}}</ref>
<syntaxhighlight lang="rust">
let x = 1000;
println!("1000 as a u16 is: {}", x as u16);
</syntaxhighlight>

== Implicit casting using untagged unions ==
Many programming languages support [[Union type|union types]] which can hold a value of multiple types. ''Untagged'' unions are provided in some languages with loose type-checking, such as [[C (programming language)|C]] and [[PL/I]], but also in the original [[Pascal (programming language)|Pascal]]. These can be used to interpret the bit pattern of one type as a value of another type.

==Security issues==
In [[hacker (computer security)|hacking]], typecasting is the misuse of type conversion to temporarily change a [[variable (computer science)|variable]]'s data type from how it was originally defined.<ref>Jon Erickson ''Hacking, 2nd Edition: The Art of Exploitation'' 2008 1593271441 p51 "Typecasting is simply a way to temporarily change a variable's data type, despite how it was originally defined. When a variable is typecast into a different type, the compiler is basically told to treat that variable as if it were the new data type, but only for that operation. The syntax for typecasting is as follows: (typecast_data_type) variable ..."</ref> This provides opportunities for hackers, since after a variable is "typecast" to a different data type, the compiler will treat that variable as the new data type for that specific operation.<ref>Arpita Gopal ''Magnifying C'' 2009 8120338618 p. 59 "From the above, it is clear that the usage of typecasting is to make a variable of one type, act like another type for one single operation. So by using this ability of typecasting it is possible for create ASCII characters by typecasting integer to its ..."</ref>

==See also==
* [[Downcasting]]
* {{section link|Run-time type information|C++ – dynamic cast and Java cast}}
* [[Type punning]]

==References==
{{Reflist}}

== External links ==
* [http://www.adapower.com/index.php?Command=Class&ClassID=FAQ&CID=354 Casting in Ada]
* [[Wikibooks:C++ Programming/Programming Languages/C++/Code/Statements/Variables/Type Casting|Casting in C++]]
* [https://web.archive.org/web/20160709112746/http://www.informit.com/guides/content.aspx?g=cplusplus&seqNum=285 C++ Reference Guide] Why I hate C++ Cast Operators, by Danny Kalev
* [https://docs.oracle.com/javase/specs/jls/se7/html/jls-5.html#jls-5.5 Casting in Java]
* [http://msdn.microsoft.com/en-us/library/aa691280(v=vs.71).aspx Implicit Conversions in C#]
* [http://cppreference.com/wiki/language/implicit_cast Implicit Type Casting at Cppreference.com]
* [http://www.cs.umd.edu/class/sum2003/cmsc311/Notes/BitOp/cast.html Static and Reinterpretation castings in C++]
* [[Wikibooks:F Sharp Programming/Inheritance#Up-casting and Down-casting|Upcasting and Downcasting in F#]]

{{Data types}}

{{DEFAULTSORT:Type Conversion}}
[[Category:Data types]]
[[Category:Operators (programming)]]
[[de:Typumwandlung]]
[[Category:Type theory]]
[[Category:Unary operations]]

Latest revision as of 08:31, 23 May 2024

In computer science, type conversion,[1][2] type casting,[1][3] type coercion,[3] and type juggling[4][5] are different ways of changing an expression from one data type to another. An example would be the conversion of an integer value into a floating point value or its textual representation as a string, and vice versa. Type conversions can take advantage of certain features of type hierarchies or data representations. Two important aspects of a type conversion are whether it happens implicitly (automatically) or explicitly,[1][6] and whether the underlying data representation is converted from one representation into another, or a given representation is merely reinterpreted as the representation of another data type.[6][7] In general, both primitive and compound data types can be converted.

Each programming language has its own rules on how types can be converted. Languages with strong typing typically do little implicit conversion and discourage the reinterpretation of representations, while languages with weak typing perform many implicit conversions between data types. Weakly typed languages often allow forcing the compiler to arbitrarily interpret a data item as having different representations; this can be a non-obvious programming error, or a technical method to directly deal with underlying hardware.

In most languages, the word coercion is used to denote an implicit conversion, either during compilation or during run time. For example, in an expression mixing integer and floating point numbers (like 5 + 0.1), the compiler will automatically convert integer representation into floating point representation so fractions are not lost. Explicit type conversions are either indicated by writing additional code (e.g. adding type identifiers or calling built-in routines) or by coding conversion routines for the compiler to use when it otherwise would halt with a type mismatch.

In most ALGOL-like languages, such as Pascal, Modula-2, Ada and Delphi, conversion and casting are distinctly different concepts. In these languages, conversion refers to either implicitly or explicitly changing a value from one data type storage format to another, e.g. a 16-bit integer to a 32-bit integer. The storage needs may change as a result of the conversion, including a possible loss of precision or truncation. The word cast, on the other hand, refers to explicitly changing the interpretation of the bit pattern representing a value from one type to another. For example, 32 contiguous bits may be treated as an array of 32 booleans, a 4-byte string, an unsigned 32-bit integer or an IEEE single precision floating point value. Because the stored bits are never changed, the programmer must know low level details such as representation format, byte order, and alignment needs, to meaningfully cast.

In the C family of languages and ALGOL 68, the word cast typically refers to an explicit type conversion (as opposed to an implicit conversion), causing some ambiguity about whether this is a re-interpretation of a bit-pattern or a real data representation conversion. More important is the multitude of ways and rules that apply to what data type (or class) is located by a pointer and how a pointer may be adjusted by the compiler in cases like object (class) inheritance.

Explicit casting in various languages

Ada

Ada provides a generic library function Unchecked_Conversion.[8]

C-like languages

Implicit type conversion

Implicit type conversion, also known as coercion or type juggling, is an automatic type conversion by the compiler. Some programming languages allow compilers to provide coercion; others require it.

In a mixed-type expression, data of one or more subtypes can be converted to a supertype as needed at runtime so that the program will run correctly. For example, the following is legal C language code:

double  d;
long    l;
int     i;

if (d > i)   d = i;
if (i > l)   l = i;
if (d == l)  d *= 2;

Although d, l, and i belong to different data types, they will be automatically converted to a common data type each time a comparison or assignment is executed. This behavior should be used with caution, as unintended consequences can arise. Data can be lost when converting representations from floating-point to integer, as the fractional components of the floating-point values will be truncated (rounded toward zero). Conversely, precision can be lost when converting representations from integer to floating-point, since a floating-point type may be unable to exactly represent all possible values of some integer type. For example, float might be an IEEE 754 single precision type, which cannot represent the integer 16777217 exactly, while a 32-bit integer type can. This can lead to unintuitive behavior, as demonstrated by the following code:

#include <stdio.h>

int main(void)
{
    int i_value   = 16777217;
    float f_value = 16777216.0;
    printf("The integer is: %d\n", i_value);
    printf("The float is:   %f\n", f_value);
    printf("Their equality: %d\n", i_value == f_value);
}

On compilers that implement floats as IEEE single precision, and ints as at least 32 bits, this code will give this peculiar print-out:

The integer is: 16777217
The float is:   16777216.000000
Their equality: 1

Note that 1 represents equality in the last line above. This odd behavior is caused by an implicit conversion of i_value to float when it is compared with f_value. The conversion causes loss of precision, which makes the values equal before the comparison.

Important takeaways:

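  1. float to int causes truncation, i.e., removal of the fractional part.
  2. double to float causes rounding to a value that float can represent.
  3. long to int drops the excess higher-order bits on typical implementations when the value does not fit (for signed types, the C standard leaves the result implementation-defined).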

Type promotion

One special case of implicit type conversion is type promotion, where an object is automatically converted into another data type representing a superset of the original type. Promotions are commonly used with types smaller than the native type of the target platform's arithmetic logic unit (ALU), before arithmetic and logical operations, to make such operations possible, or more efficient if the ALU can work with more than one type. C and C++ perform such promotion for objects of boolean, character, wide character, enumeration, and short integer types, which are promoted to int (or to unsigned int when int cannot represent all values of the original type), and for arguments of type float passed to variadic functions, which are promoted to double. Unlike some other type conversions, promotions never lose precision or modify the value stored in the object.

In Java:

int x = 3;
double y = 3.5;
System.out.println(x + y); // The output will be 6.5

Explicit type conversion

Explicit type conversion, also called type casting, is a type conversion that the programmer requests explicitly in the program, rather than one performed automatically according to the rules of the language for implicit type conversion.

double da = 3.3;
double db = 3.3;
double dc = 3.4;
int result = (int)da + (int)db + (int)dc; // result == 9
// if implicit conversion were used instead (as with "result = da + db + dc"), result would be equal to 10

There are several kinds of explicit conversion.

checked
Before the conversion is performed, a runtime check is done to see if the destination type can hold the source value. If not, an error condition is raised.
unchecked
No check is performed. If the destination type cannot hold the source value, the result is undefined.
bit pattern
The raw bit representation of the source is copied verbatim, and it is re-interpreted according to the destination type. This can also be achieved via aliasing.

In object-oriented programming languages, objects can also be downcast: a reference to a base class is cast to one of its derived classes.

C# and C++

In C#, type conversion can be made in a safe or unsafe (i.e., C-like) manner, the former called checked type cast.[9]

Animal animal = new Cat();

Bulldog b = (Bulldog) animal;  // succeeds if (animal is Bulldog), else throws an InvalidCastException
b = animal as Bulldog;         // if (animal is Bulldog), b = (Bulldog) animal, else b = null

animal = null;
b = animal as Bulldog;         // b == null

In C++ a similar effect can be achieved using C++-style cast syntax.

Animal* animal = new Cat;

Bulldog* b = static_cast<Bulldog*>(animal); // compiles only if either Animal or Bulldog is derived from the other (or same)
b = dynamic_cast<Bulldog*>(animal);         // if (animal is Bulldog), b = (Bulldog*) animal, else b = nullptr

Bulldog& br = dynamic_cast<Bulldog&>(*animal); // same as above, but std::bad_cast is thrown where a nullptr would have been returned
                                               // this is not seen in code where exception handling is avoided

delete animal; // always free resources
animal = nullptr;
b = dynamic_cast<Bulldog*>(animal);         // b == nullptr

Eiffel

In Eiffel the notion of type conversion is integrated into the rules of the type system. The Assignment Rule says that an assignment, such as:

x := y

is valid if and only if the type of its source expression, y in this case, is compatible with the type of its target entity, x in this case. In this rule, compatible with means that the type of the source expression either conforms to or converts to that of the target. Conformance of types is defined by the familiar rules for polymorphism in object-oriented programming. For example, in the assignment above, the type of y conforms to the type of x if the class upon which y is based is a descendant of that upon which x is based.

Definition of type conversion in Eiffel

The type conversion relations in Eiffel, specifically converts to and converts from, are defined as follows:

A type U based on a class CU converts to a type T based on a class CT (and T converts from U) if either

CT has a conversion procedure using U as a conversion type, or
CU has a conversion query listing T as a conversion type

Example

Eiffel is a fully compliant language for Microsoft .NET Framework. Before development of .NET, Eiffel already had extensive class libraries. Using the .NET type libraries, particularly with commonly used types such as strings, poses a conversion problem. Existing Eiffel software uses the string classes (such as STRING_8) from the Eiffel libraries, but Eiffel software written for .NET must use the .NET string class (System.String) in many cases, for example when calling .NET methods which expect items of the .NET type to be passed as arguments. So, the conversion of these types back and forth needs to be as seamless as possible.

    my_string: STRING_8                 -- Native Eiffel string
    my_system_string: SYSTEM_STRING     -- Native .NET string

        ...

            my_string := my_system_string

In the code above, two strings are declared, one of each type (SYSTEM_STRING is the Eiffel-compliant alias for System.String). Because System.String does not conform to STRING_8, the assignment above is valid only if System.String converts to STRING_8.

The Eiffel class STRING_8 has a conversion procedure make_from_cil for objects of type System.String. Conversion procedures are also always designated as creation procedures (similar to constructors). The following is an excerpt from the STRING_8 class:

    class STRING_8
        ...
    create
        make_from_cil
        ...
    convert
        make_from_cil ({SYSTEM_STRING})
        ...

The presence of the conversion procedure makes the assignment:

            my_string := my_system_string

semantically equivalent to:

            create my_string.make_from_cil (my_system_string)

in which my_string is constructed as a new object of type STRING_8 with content equivalent to that of my_system_string.

To handle an assignment with original source and target reversed:

            my_system_string := my_string

the class STRING_8 also contains a conversion query to_cil which will produce a System.String from an instance of STRING_8.

    class STRING_8
        ...
    create
        make_from_cil
        ...
    convert
        make_from_cil ({SYSTEM_STRING})
        to_cil: {SYSTEM_STRING}
        ...

The assignment:

            my_system_string := my_string

then, becomes equivalent to:

            my_system_string := my_string.to_cil

In Eiffel, the setup for type conversion is included in the class code, but the conversion then happens automatically in client code, without any explicit cast. This includes not just assignments but other types of attachments as well, such as argument (parameter) substitution.

Rust

Rust provides no implicit type conversion (coercion) between primitive types. However, explicit type conversion (casting) can be performed using the as keyword.[10]

let x = 1000;
println!("1000 as a u16 is: {}", x as u16);

Implicit casting using untagged unions

Many programming languages support union types which can hold a value of multiple types. Untagged unions are provided in some languages with loose type-checking, such as C and PL/I, but also in the original Pascal. These can be used to interpret the bit pattern of one type as a value of another type.

Security issues

In hacking, typecasting is the misuse of type conversion to temporarily change a variable's data type from how it was originally defined.[11] This provides opportunities for hackers because, once a variable is "typecast" to a different data type, the compiler treats it as the new data type for that specific operation.[12]

See also

References

  1. Mehrotra, Dheeraj (2008). S. Chand's Computer Science. S. Chand. pp. 81–83. ISBN 978-8121929844.
  2. Programming Languages - Design and Constructs. Laxmi Publications. 2013. p. 35. ISBN 978-9381159415.
  3. Reilly, Edwin (2004). Concise Encyclopedia of Computer Science. John Wiley & Sons. pp. 82, 110. ISBN 0470090952.
  4. Fenton, Steve (2017). Pro TypeScript: Application-Scale JavaScript Development. Apress. pp. xxiii. ISBN 978-1484232491.
  5. "PHP: Type Juggling - Manual". php.net. Retrieved 27 January 2019.
  6. Olsson, Mikael (2013). C++ Quick Syntax Reference. Apress. pp. 87–89. ISBN 978-1430262770.
  7. Kruse, Rudolf; Borgelt, Christian; Braune, Christian; Mostaghim, Sanaz; Steinbrecher, Matthias (16 September 2016). Computational Intelligence: A Methodological Introduction. Springer. p. 269. ISBN 978-1447172963.
  8. "Unchecked Type Conversions". Ada Information Clearinghouse. Retrieved 11 March 2023.
  9. Mössenböck, Hanspeter (25 March 2002). "Advanced C#: Checked Type Casts" (PDF). Institut für Systemsoftware, Johannes Kepler Universität Linz, Fachbereich Informatik. p. 5. Retrieved 4 August 2011. at C# Tutorial
  10. "Casting - Rust By Example". doc.rust-lang.org.
  11. Erickson, Jon (2008). Hacking: The Art of Exploitation (2nd ed.). p. 51. ISBN 1593271441. "Typecasting is simply a way to temporarily change a variable's data type, despite how it was originally defined. When a variable is typecast into a different type, the compiler is basically told to treat that variable as if it were the new data type, but only for that operation. The syntax for typecasting is as follows: (typecast_data_type) variable ..."
  12. Gopal, Arpita (2009). Magnifying C. p. 59. ISBN 8120338618. "From the above, it is clear that the usage of typecasting is to make a variable of one type, act like another type for one single operation. So by using this ability of typecasting it is possible for create ASCII characters by typecasting integer to its ..."

External links