Version: v0.7.0 - Beta.  We welcome contributors & feedback.

Language Design Notes

About

This page is a list of the reasons behind many of the design decisions made for THT.

Ultimately, every decision is a trade-off, trying to balance many factors like familiarity, simplicity, safety, etc.

Not every decision will appeal to everyone, but I hope this page will show that a lot of thought was put into every part of the language.

Design Principles

Here are some of the higher-level principles that helped guide these decisions:

Confidence Score

The “V1 Confidence” percentage for each feature is the estimated likelihood that it will remain unchanged in Version 1.0.

Nothing is etched in stone, so feedback from Beta users could change things.

Comparisons

In a couple of cases, I refer to the Laravel framework, because it is a large codebase and an example of well-written, modern PHP code.

Dot Methods 

// PHP
Module->method()

// THT
Module.method()

THT replaces the arrow notation of PHP's method calls with dots, because dots are simply easier to type and create less visual noise.

This approach is mainstream, and hopefully non-controversial.

Other languages that use dot method calls: JavaScript, Java, Python, Ruby, Swift

V1 Confidence: 100%

JSON Maps 

// PHP
[ 'key' => 'value', 'num' => 123 ]

// THT
{ key: 'value', num: 123 }

THT replaces PHP's arrow/bracket array notation with JavaScript object-literal notation.

This is easier to type and contains less visual noise, and should be familiar to all web developers.

V1 Confidence: 100%

Prefixed Binary Operators 

// PHP
$result = $op1 | $op2;

// THT
$result = $op1 +| $op2

Bitwise (binary) operators are almost never used in web development, and they are too easily mistaken for their logical counterparts (for example, | versus ||).

If you want a bitmask, most of the time a keyword Map will be easier to work with.
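As a rough illustration of that trade-off, here is a minimal sketch in Python (not THT). The permission names are hypothetical and only there for the example. It contrasts a bitmask with a plain map of named flags, and shows how a logical operator can be used by mistake where a bitwise one was intended.

```python
# Hypothetical permission flags, for illustration only.
READ = 0b001
WRITE = 0b010
ADMIN = 0b100

# Bitmask style: compact, but relies on bitwise operators.
mask = READ | WRITE
can_write_mask = bool(mask & WRITE)
can_admin_mask = bool(mask & ADMIN)

# Easy mistake: the logical `or` returns its first truthy operand,
# so this is just READ (1), not the combined mask (3).
mistaken = READ or WRITE

# Map style: each flag is a named boolean, no bitwise operators needed.
flags = {"read": True, "write": True, "admin": False}
can_write_map = flags["write"]
can_admin_map = flags["admin"]

print(mask, mistaken)                  # 3 1
print(can_write_mask, can_admin_mask)  # True False
print(can_write_map, can_admin_map)    # True False
```

The map version reads the same way it is written, which is the point being made about bitmasks above.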

This approach is borrowed from Raku (Perl 6).

V1 Confidence: 100%

No Semicolons 

// PHP
$a += 1;
print($a);

// THT
$a += 1
print($a)

This reduces visual noise, and it reinforces the good practice of having only one statement per line.

It will sometimes be inconvenient for those of us with semicolons engrained in our muscle memory, but THT gives super clear feedback when a mistake is made, so it’s easy to fix.

Other languages that don’t use/require semicolons: Python, Ruby, Swift, Go, Lua

V1 Confidence: 90%

No Outer Parens 

// PHP
if ($condition) {
    ...
}

// THT
if $condition {
    ...
}

This reduces visual noise, and makes it easier to balance parens that are within the condition itself.

Other languages that don’t require parens: Python, Ruby, Rust, Swift, Go

V1 Confidence: 90%

Dollar Variables 

// THT & PHP
$myVar = 123

THT keeps the dollar “sigils” in variables to retain its identity as a PHP-based programming language.

This admittedly goes against the idea of removing visual noise, but in this case, familiarity is more important.

V1 Confidence: 90%

Single-Quoted Strings 

// PHP
$myString = 'Hello ' . "World!";

// THT
$myString = 'Hello ' ~ 'World!'

In PHP, the ability to choose between single or double-quotes is sometimes useful.

However, because string literals are extremely common, this leads to hundreds (maybe thousands) of micro-decisions per project.

Single quotes were chosen for THT because they are a little easier to type (no SHIFT key), and they create a little less visual noise.

Note: Interpolation is TBD, but THT currently has multiple ways to insert text via .fill(), the ~ stringy operator, and template functions.

V1 Confidence: 90%

No for Loop 

“For what it’s worth, we don’t have a single C style for loop in the Lyft codebase.” (Keith Smiley, Lyft)

// PHP
for ($i = 1; $i <= 10; $i++) { ... }

// THT
foreach range(1, 10) as $i { ... }

In any language with a foreach (or for in) construct, the C-style for loop is mostly unnecessary because the vast majority of loops iterate over a collection or a range of numbers.

For example, the Laravel project uses foreach 638 times and for 9 times, and 7 of those 9 could be written as a foreach over a range. Counting only the 2 loops that genuinely need a C-style for, that’s about 300-to-1.
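The same pattern can be seen in Python, one of the languages without a C-style for loop. This is a minimal sketch with illustrative data, not THT code. Note one difference: the THT example above pairs range(1, 10) with $i <= 10, which suggests an inclusive end, whereas Python's range() excludes its end value.

```python
# A counting loop expressed as iteration over a range.
# Python's range() excludes its end value, so range(1, 11) yields 1 through 10.
counted = [i for i in range(1, 11)]

# The far more common case: iterating directly over a collection.
names = ["ada", "grace", "linus"]  # illustrative data
greetings = []
for name in names:
    greetings.append("Hello " + name)

print(counted[0], counted[-1])  # 1 10
print(greetings[0])             # Hello ada
```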

Languages that follow this pattern: Python, Ruby, Swift, Rust

V1 Confidence: 90%

No Unary Increment ++ and -- 

// PHP
if (++$myVar) { ... }

// THT
$myVar += 1
if $myVar { ... }

For such a simple operation (adding 1), this operator is quite complicated.

It often tempts programmers to write “clever” code that mixes mutation with evaluation, and is further complicated by behaving differently when it appears before or after the subject.

You can simply use += 1 instead.

Languages that don’t use ++: Python, Ruby, Swift, Rust

V1 Confidence: 90%

No while Loop 

// PHP
$status = true;
while ($status) {
    $status = doSomething();
    if (!$status) { break; }
}

// THT
loop {
    $status = doSomething()
    if !$status: break
}

The while statement often leads to off-by-one errors and redundant initialization.

The do/while construct complicates things further, as the only language feature that turns the conventional (predicate) { block } order upside down.

THT's loop codifies the convention of a while (true) loop, giving you total control over the order of operations and where the loop should break.
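For comparison, the closest Python equivalent is a while True loop with an explicit break. The sketch below is not THT code, and make_worker and do_something are hypothetical stand-ins for the doSomething() call in the example above.

```python
def make_worker(limit):
    """Return a hypothetical do_something() that succeeds `limit` times, then fails."""
    calls = {"n": 0}

    def do_something():
        calls["n"] += 1
        return calls["n"] <= limit

    return do_something


do_something = make_worker(3)
iterations = 0

while True:  # plays the role of THT's `loop`
    status = do_something()
    if not status:
        break  # the exit point is explicit and sits exactly where it is needed
    iterations += 1

print(iterations)  # 3
```

There is no status variable to initialize before the loop, and the break can be placed before or after any other work in the body.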

Keep in mind that while isn’t used very often, so this isn’t a high impact change. In the Laravel project, while appears in 1 out of every 680 logical lines of code.

Languages with loop: Rust

V1 Confidence: 80%

CamelCase Names 

// PHP
$my_variable = HTTPClass::myFunction();

// THT
$myVariable = HttpClass.myFunction()

In THT, everything is camelCase.

I realize this might be a deal-breaker for some people. I honestly don’t have a strong preference, but the consensus among professional programmers is that having a single consistent style is important, regardless of what is used.

Benefits of a single language-level style:

CamelCase was chosen for a few reasons:

It also allows the THT compiler to do some things like:

Languages that use camelCase convention: JavaScript, Java, Swift

V1 Confidence: 80%

No Null 

Tony Hoare, the creator of Null, called it his “Billion Dollar Mistake”.

Null complicates programs because it means that every variable can have a 3rd state that overlaps with all other types, and it can trigger errors far from where it was set.

Modern languages favor the Option Type pattern instead, which THT supports via Result.
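To show the general shape of the pattern, here is a minimal sketch in Python. It does not reproduce THT's Result API; the Ok and Err classes and the find_user and unwrap_or helpers are hypothetical names invented for this example.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Ok:
    """Hypothetical wrapper for a value that is present."""
    value: str


@dataclass(frozen=True)
class Err:
    """Hypothetical wrapper for a missing value, with a reason."""
    message: str


Result = Union[Ok, Err]


def find_user(users: dict, user_id: int) -> Result:
    """Return Ok(name) when the user exists, otherwise Err(reason)."""
    if user_id in users:
        return Ok(users[user_id])
    return Err(f"no user with id {user_id}")


def unwrap_or(result: Result, default: str) -> str:
    """Force the caller to decide what happens when the value is missing."""
    return result.value if isinstance(result, Ok) else default


users = {1: "ada", 2: "grace"}  # illustrative data
found = find_user(users, 1)
missing = find_user(users, 99)

print(unwrap_or(found, "guest"))    # ada
print(unwrap_or(missing, "guest"))  # guest
print(missing.message)              # no user with id 99
```

The missing case is a distinct type that has to be handled at the call site, rather than a null that can travel through the program unnoticed.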

As THT's object system is developed further, we will examine safer ways to re-introduce the concept of an uninitialized object.

Languages that favor the Option Type pattern: Rust, Swift

V1 Confidence: 70%

1-Based Indexing 

$list = ['a', 'b', 'c']

// PHP
$list[0] //= 'a'
$list[1] //= 'b'
$list[2] //= 'c'

// THT
$list[1] //= 'a'
$list[2] //= 'b'
$list[3] //= 'c'

Unlike PHP and many other C-derived languages, THT starts at one when counting indexes, instead of zero.

One theory as to why zero-based indexes were chosen for C is that it was more efficient and convenient to treat indexes as offsets in memory (i.e. the first element is 0 places from the start).

A high-level language like THT doesn’t involve direct memory manipulation, so this is no longer necessary. Also, any tiny performance benefits that were necessary 50 years ago are irrelevant on today’s machines (the PDP-11 ran at about 1.2 MHz).
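The offset argument can be sketched as simple address arithmetic. The following Python snippet is illustrative only: the base address and element size are assumed values, and the function names are hypothetical.

```python
ELEMENT_SIZE = 4     # assumed bytes per element (illustrative)
BASE_ADDRESS = 1000  # assumed starting address of the array (illustrative)


def address_zero_based(index):
    """With 0-based indexes, the index is already the offset from the start."""
    return BASE_ADDRESS + index * ELEMENT_SIZE


def address_one_based(index):
    """With 1-based indexes, one extra subtraction is needed before scaling."""
    return BASE_ADDRESS + (index - 1) * ELEMENT_SIZE


print(address_zero_based(0))  # 1000
print(address_one_based(1))   # 1000
print(address_zero_based(2) == address_one_based(3))  # True
```

The only cost of 1-based indexing here is a single subtraction per access, which is the tiny performance difference referred to above.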

Note that languages dedicated to math and statistics, like Mathematica and Julia, actually use 1-indexing.

Here are a number of practical/cognitive benefits:

1: 1st
2: 2nd
3: 3rd

// 1-Index
if !$list.indexOf('X'): return

// 0-Index
if $list.indexOf('X') == -1: return

Other languages use out-of-band values like false or -1 to indicate a missing item. This forces the programmer to use extra caution to make sure it isn’t interpreted as a valid index.
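The pitfall is easy to reproduce. As a minimal sketch in Python (not THT): str.find returns -1 for a missing substring, and -1 is also a valid index meaning the last element. The index_of_1based helper is hypothetical and only imitates the 1-based behavior described here, where a missing item yields 0.

```python
letters = "abc"

# Python's str.find signals "not found" with -1 ...
bad_index = letters.find("X")

# ... but -1 is also a valid index (the last element), so the bug is silent.
silently_wrong = letters[bad_index]


def index_of_1based(items, target):
    """Hypothetical 1-based lookup: returns 0 (falsy) when the item is missing."""
    for position, item in enumerate(items, start=1):
        if item == target:
            return position
    return 0


found = index_of_1based(["a", "b", "c"], "b")
missing = index_of_1based(["a", "b", "c"], "X")

print(bad_index, silently_wrong)  # -1 c
print(found, missing)             # 2 0
print(bool(missing))              # False
```

Because 0 is never a valid position in a 1-based list, the "missing" result can be used directly in a condition and cannot be confused with a real index.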

$badIndex = $list.indexOf('missing')

// THT (0)
$list[$badIndex]  // ✓ Immediate error (safe)

// PHP (false)
$list[$badIndex]  // ✕ BUG! Gets the 1st element (false == 0)

// JavaScript (-1)
$list[$badIndex]  // ✕ BUG! Silently yields undefined

// 1-Index
$isLastItem = $index == $list.length()

// 0-Index
$isLastItem = $index == $list.length() - 1

This makes ranges a bit simpler as well, with less confusion around inclusive vs exclusive ranges.

// 1-Index
foreach range(1, $list.length()) as $i { ... }

// 0-Index
foreach range(0, $list.length() - 1) as $i { ... }

$one = $list[1]
$two = $list[2]

$lastOne = $list[-1]
$lastTwo = $list[-2]

“When starting with subscript 1, the subscript range 1 ≤ i < N+1; starting with 0, however, gives the nicer range 0 ≤ i < N.” (Edsger W. Dijkstra, “Why numbering should start at zero”)

First, Dijkstra offers no real practical benefit for programmers, only an aesthetic argument. (One could argue that “1 ≤ i ≤ N” is even nicer.)

He mentions an anecdote about a defunct language called Mesa, but offers no details.

Second, it’s largely moot nowadays, since the foreach construct means we rarely work with ranges directly. (The Laravel project only uses ranges 7 times in 20,000 logical lines of code.)

Other languages that use 1-based indexing: Smalltalk, Mathematica, R, Lua, Julia.

V1 Confidence: 70%