Operators in programming languages
Do operators have to be common between programming languages? Or do we even need symbols for adding numbers together? Let’s look at some weird differences!
The most common “mistakes”
JavaScript.
Gary Bernhardt already showed us this in his short, funny talk “Wat”.
Let’s talk about JavaScript. Adding the number 2 to the string "3" (which means "3" + 2) gives a different kind of result than subtracting them (meaning "3" - 2). The first results in the string "32", the second results in the number 1.
"3" + 2 // "32"
"3" - 2 // 1
Why does this happen? Let’s decompose "3" + 2 into an Abstract Syntax Tree (AST) using Esprima Parser:
{
  "type": "Program",
  "body": [
    {
      "type": "ExpressionStatement",
      "expression": {
        "type": "BinaryExpression",
        "operator": "+",
        "left": {
          "type": "Literal",
          "value": "3",
          "raw": "\"3\""
        },
        "right": {
          "type": "Literal",
          "value": 2,
          "raw": "2"
        }
      }
    }
  ],
  "sourceType": "script"
}
First, there is an Expression Statement, which consists of only a Binary Expression. A binary expression is an expression that takes two values. The operator for this binary expression is + and it has 2 “literal” arguments.
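To make the tree a bit more tangible, here is a minimal sketch of a function that walks it. The `evaluate` function is a hypothetical helper written just for this post, it only knows the node types shown above, and it is not how a real JavaScript engine works.

```javascript
// Hypothetical mini-evaluator for the node types that appear in the tree above.
// Illustration only: a real engine does far more than this.
function evaluate(node) {
  switch (node.type) {
    case "Literal":
      return node.value; // "3" stays a string, 2 stays a number
    case "ExpressionStatement":
      return evaluate(node.expression);
    case "BinaryExpression": {
      const left = evaluate(node.left);
      const right = evaluate(node.right);
      if (node.operator === "+") return left + right;
      if (node.operator === "-") return left - right;
      throw new Error("Unsupported operator: " + node.operator);
    }
    default:
      throw new Error("Unsupported node type: " + node.type);
  }
}

// The expression statement from the tree above, written as a plain object.
const statement = {
  type: "ExpressionStatement",
  expression: {
    type: "BinaryExpression",
    operator: "+",
    left: { type: "Literal", value: "3", raw: "\"3\"" },
    right: { type: "Literal", value: 2, raw: "2" }
  }
};

const astResult = evaluate(statement);
console.log(astResult, typeof astResult); // 32 string
```

The interesting part is that the tree itself says nothing about concatenation. That decision is made only when the operator meets its actual values.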
So, let’s follow how the result was produced:
- the parser decided to use the binary operator + on the arguments "3" (string) and 2 (number)
- the left argument is "3", which is a string!
- now, the REAL operator is chosen. + would work differently if no argument was a string, but here the “left” expression (just "3") already is one
- the operator chosen is the one for adding two strings. This means concatenation (joining two strings)
- So. Concatenate "3" with 2. Ooops! The second argument is not a string? Implicitly (without telling you) convert 2 to a string!
- Now concatenate "3" with "2". It’s "32".
If the operator was -, then… Strings don’t have such an operator defined. Only numbers do, so the left argument would be converted to a number. So "3" - 2 would be converted to 3 - 2.
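The choice described above can be sketched in a few lines. This is a simplified model, not the real specification: `plus` and `minus` are hypothetical helper names, and the sketch assumes only strings, numbers, booleans, null and undefined are passed in.

```javascript
// Simplified, hypothetical model of how + and - treat primitive values.
// Assumption: objects, BigInt and Symbol are left out (they follow extra rules).
function plus(left, right) {
  if (typeof left === "string" || typeof right === "string") {
    return String(left) + String(right); // concatenation wins
  }
  return Number(left) + Number(right); // otherwise numeric addition
}

function minus(left, right) {
  // There is no string version of -, so both sides become numbers.
  return Number(left) - Number(right);
}

console.log(plus("3", 2));  // 32 (a string)
console.log(minus("3", 2)); // 1 (a number)
console.log(plus(2, "3"));  // 23 (a string): a string on either side is enough
console.log(plus("3", 2) === "3" + 2, minus("3", 2) === "3" - 2); // true true
```

Note the third line: the string does not have to be on the left for concatenation to kick in.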
+ as a multi-operator?
That’s what JavaScript chose to do. There are languages that don’t do that. For instance, let’s look at PHP:
- + (plus) is for numbers (with implicit conversion to numbers)
- . (dot) is for string concatenation (with implicit conversion to string, too)
- + (plus again!) is for combining arrays (a union by key rather than a plain concatenation)… well, that was kind of unexpected
Anyway, an expression like "a" + "5" doesn’t concatenate the way it does in JavaScript. In PHP versions before 8 it would simply result in 5. A numeric 5. (PHP 8 throws a TypeError for the non-numeric "a" instead.)
What does your + mean in there?
C++ doesn’t have that: for built-in types you add only numbers, no concatenation this way. The exception is that operators are overloadable in classes. So two objects of type std::string can be concatenated with +.
It’s the same with Ruby, but there’s even more: you can redefine operators dynamically, at runtime. In C++, custom operators have to be greenlit by the compiler.
But hey, do you mean there are no implicit conversions? There are! In C++ there’s something more than just numbers. Let’s differentiate…
Integers and Floats
15 is an int by default, but the value could also be stored in a bigger type like long or in a smaller one like short or char. 15.0f is a float. 15.0 is a double (a bigger float, let’s say).
Passing the value 0 (an integer) as an argument to a function that takes a float will (unless the compiler is configured otherwise) implicitly convert it to 0.0f.
In JavaScript and PHP a number is just a number, no matter how big it is, because those languages don’t want you to care about bits.
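For the JavaScript side, here is a small sketch of what “just a number” looks like in practice, and of the one place where the bits leak through anyway. It only uses built-in `Number` properties. The variable names are mine.

```javascript
// Integers and fractions share one type in JavaScript.
const wholeType = typeof 15;
const fractionType = typeof 15.5;
console.log(wholeType, fractionType); // number number
console.log(15 === 15.0);             // true

// The flip side: above Number.MAX_SAFE_INTEGER (2^53 - 1) neighbouring
// integers can no longer be told apart.
const biggestSafe = Number.MAX_SAFE_INTEGER;
const precisionLost = biggestSafe + 1 === biggestSafe + 2;
console.log(precisionLost); // true
```

So you don’t choose between int and float, but very large integers still need some care.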
And here’s OCaml:
- + for adding integers
- +. for adding floats
What about Elm?
- // (two slashes) for dividing integers
- / (one slash) for dividing floats!
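For comparison, JavaScript has only the one slash, and it always divides as floating point. Integer division has to be spelled out by hand. A minimal sketch, where `intDiv` is a hypothetical helper and truncating toward zero is my assumption about what “integer division” should mean here:

```javascript
// Hypothetical helper: integer division built on top of the only / there is.
// Assumption: the result is truncated toward zero.
function intDiv(a, b) {
  return Math.trunc(a / b);
}

console.log(7 / 2);         // 3.5
console.log(intDiv(7, 2));  // 3
console.log(intDiv(-7, 2)); // -3
```

Languages with two separate operators make that decision visible in the source instead of hiding it in a helper.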
Assignment operator
That’s an interesting one. Most languages are all about the = operator. Pascal differentiated the assignment operator from the equality operator: equality is = while assignment is :=, a notation pretty often used in books for the same reason, to differentiate the operations.
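Why bother keeping the two apart so visibly? In JavaScript, where = assigns and === compares, one missing character changes the meaning completely. A small sketch (the variable names are mine):

```javascript
let answer = 0;

// Assignment is an expression: it stores 42 and also evaluates to 42.
const assigned = (answer = 42);

// Comparison changes nothing and evaluates to a boolean.
const compared = (answer === 7);

console.log(assigned, compared, answer); // 42 false 42
```

Put `answer = 42` inside an `if` by mistake and the condition is always truthy, which is exactly the kind of typo a distinct := makes harder.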
Equality operators?
In JavaScript and PHP the == operator checks for value equality. Warning: implicit conversion is there! It means 2 == "2" is true. Meanwhile, there is ===, which first compares the types of the values and then the values. 2 === "2" is false, and 2 === 2 or "2" === "2" is true.
Not equal? != or !==. The same rules about type checking apply.
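Here are the four JavaScript operators side by side, as a quick sketch you can paste into a console:

```javascript
const looseEqual = 2 == "2";      // "2" is converted to a number first
const strictEqual = 2 === "2";    // different types, so no conversion and no match
const looseNotEqual = 2 != "2";
const strictNotEqual = 2 !== "2";

console.log(looseEqual, strictEqual, looseNotEqual, strictNotEqual);
// true false false true
```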
But look at Elm: /= is hard to get used to. It’s a little easier in Erlang, where =/= checks for inequality. It looks more like the mathematical crossed equals sign than Elm’s does.
Comparing things
Comparing values may be even harder when you look at SQL dialects. Well, empty string may not be different from NULL in some cases. Here I’ll link just an example:
source: One of WTFs in Oracle SQL
Oracle has a couple of SQL WTF issues.
- Oracle’s treatment of empty strings as null.
- Treatment of null values in a “<>” comparison.
create table wtf (key number primary key, animal varchar2(10));
insert into wtf values (1,'dog');
insert into wtf values (2,'');
insert into wtf values (3,'cat');
select * from wtf where animal <> 'cat';
The only row returned is the (1,'dog') row.
Or let’s get back to JavaScript:
JavaScript truth table:
'' == '0' // false
0 == '' // true
0 == '0' // true
false == 'false' // false
false == '0' // true
false == undefined // false
false == null // false
null == undefined // true
" \t\r\n" == 0 // true
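Most rows of that table stop looking random once you see what each value turns into as a number. A small sketch using the built-in `Number()` conversion (the variable names are mine):

```javascript
const emptyAsNumber = Number("");              // 0
const zeroTextAsNumber = Number("0");          // 0
const falseAsNumber = Number(false);           // 0
const whitespaceAsNumber = Number(" \t\r\n");  // 0, whitespace-only counts as empty
const wordAsNumber = Number("false");          // NaN, and NaN equals nothing

console.log(emptyAsNumber, zeroTextAsNumber, falseAsNumber, whitespaceAsNumber, wordAsNumber);
// 0 0 0 0 NaN

// Two strings are never converted, they are compared as text.
console.log("" == "0"); // false
```

That is why 0 == '' and 0 == '0' are both true while '' == '0' is not: the last comparison never leaves string land.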
There are dozens more to learn about comparing things in various languages. But let’s stop there and…
…let’s port our project to a new language.
Since we are talking only about operators, let’s focus on equality operators. While porting your project you’ll definitely have problems with:
- comparing objects
- comparing collections
- comparing Strings
(please note: not all languages treat Strings as objects etc.)
Let’s take a look at Java Strings.
A String literal compiled into the program has a single shared reference. Comparing two Strings with the same reference works with the == operator. It doesn’t compare types or values, just the reference.
However, if your second String is built at runtime, it may have the same value but a different reference, because it’s a new object. Then we need to use str1.equals(str2) instead of str1 == str2.
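JavaScript has a close cousin of this trap, so here is a sketch in JavaScript rather than Java. It is an analogy, not a description of Java’s behaviour: primitive strings compare by value, but String objects and arrays compare by reference. `sameElements` is a hypothetical helper for flat arrays only.

```javascript
const first = new String("abc");
const second = new String("abc");

console.log(first === second);                     // false: two distinct objects
console.log(first.valueOf() === second.valueOf()); // true: same text inside
console.log("abc" === "abc");                      // true: primitives compare by value

// Hypothetical helper: compares flat arrays element by element.
function sameElements(a, b) {
  return a.length === b.length && a.every((item, index) => item === b[index]);
}

console.log([1, 2, 3] === [1, 2, 3]);             // false: two distinct arrays
console.log(sameElements([1, 2, 3], [1, 2, 3]));  // true
```

When porting code, every == on objects or collections deserves a second look for exactly this reason.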
In Erlang - string:equal("abc", "abc")
Built-in variables looking like operators!
For operator-looking variables I’ll just quote the case of the Perl language:
Perl’s many built-in variables:
- $# - not a comment!
- $0, $$, and $? - just like the shell variables by the same name
- $`, $&, and $' - weird matching variables
- $" and $, - weird variables for list- and output-field-separators
- $! - like errno as a number but strerror(errno) as a string
- $_ - the stealth variable, always used and never seen
- $#_ - index number of the last subroutine argument… maybe
- @_ - the (non)names of the current function… maybe
- $@ - the last-raised exception
- %:: - the symbol table
- $:, $^, $~, $-, and $= - something to do with output formats
- $. and $% - input line number, output page number
- $/ and $\ - input and output record separators
- $| - output buffering controller
- $[ - change your array base from 0-based to 1-based to 42-based: WHEEE!
- $} - nothing at all, oddly enough!
- $<, $>, $(, $) - real and effective UIDs and GIDs
- @ISA - names of current package’s direct superclasses
- $^T - script start-up time in epoch seconds
- $^O - current operating system name
- $^V - what version of Perl this is
There’s a lot more where those came from. Read the complete list here.
Summary
When choosing a language for a new project, it’s pretty reasonable to decide first whether we need certain features. Syntax is the last thing to choose.
Oh, you’d rather think about syntax first? Then just look at operators. There is already a lot to learn there. My advice: choose languages that have well-defined operators and are not Perl.
References
- Esprima Parser - web tool that decomposes JavaScript to AST
- StackOverflow: Strangest Language Feature - very interesting cases in various languages!
- StackOverflow answer: Perl’s many built-in variables - all looking like operators!
- StackOverflow answer: JavaScript truth table
- Wat by @garybernhardt
Languages
- C++
- Elm
- Erlang
- Java
- JavaScript
- OCaml
- Oracle SQL
- Perl
- PHP