Methods of Primitives: Objects: Advanced Data Types Page 1 of 26

Uploaded by

Jaja Arcenal

Objects : Advanced Data Types Page 1 of 26

Methods of primitives

JavaScript allows us to work with primitives (strings, numbers, etc.) as if they were objects.

They also provide methods to call as such. We will study those soon, but first we’ll see how it works
because, of course, primitives are not objects (and here we will make it even clearer).

Let’s look at the key distinctions between primitives and objects.

A primitive

• Is a value of a primitive type.
• There are 7 primitive types: string, number, bigint, boolean, symbol, null and undefined.

An object

• Is capable of storing multiple values as properties.
• Can be created with {}, for instance: {name: "John", age: 30}. There are other kinds of
objects in JavaScript: functions, for example, are objects.

One of the best things about objects is that we can store a function as one of their properties.

let john = {
  name: "John",
  sayHi: function() {
    alert("Hi buddy!");
  }
};

john.sayHi(); // Hi buddy!

So here we’ve made an object john with the method sayHi.

Many built-in objects already exist, such as those that work with dates, errors, HTML elements, etc.
They have different properties and methods.

But, these features come with a cost!

Objects are “heavier” than primitives. They require additional resources to support the internal
machinery. But as properties and methods are very useful in programming, JavaScript engines try to
optimize them to reduce the additional burden.

A primitive as an object

Here’s the paradox faced by the creator of JavaScript:

• There are many things one would want to do with a primitive like a string or a number. It
would be great to do them using methods.
• Primitives must be as fast and lightweight as possible.

The solution looks a little bit awkward, but here it is:


Lesson 3  jyercia
Objects : Advanced Data Types Page 2 of 26

1. Primitives are still primitive. A single value, as desired.


2. The language allows access to methods and properties of strings, numbers, booleans and
symbols.
3. In order for that to work, a special “object wrapper” that provides the extra functionality is
created, and then is destroyed.

The “object wrappers” are different for each primitive type and are called: String, Number, Boolean
and Symbol. Thus, they provide different sets of methods.

For instance, there exists a method str.toUpperCase() that returns str converted to upper case.

Here’s how it works:

let str = "Hello";

alert( str.toUpperCase() ); // HELLO

Simple, right? Here’s what actually happens in str.toUpperCase():

1. The string str is a primitive. So in the moment of accessing its property, a special object is
created that knows the value of the string, and has useful methods, like toUpperCase().
2. That method runs and returns a new string (shown by alert).
3. The special object is destroyed, leaving the primitive str alone.

So primitives can provide methods, but they still remain lightweight.

The JavaScript engine highly optimizes this process. It may even skip the creation of the extra object
at all. But it must still adhere to the specification and behave as if it creates one.

A number has methods of its own, for instance, toFixed(n) rounds the number to the given precision:

let n = 1.23456;

alert( n.toFixed(2) ); // 1.23

We’ll see more specific methods in chapters Numbers and Strings.

Constructors String/Number/Boolean are for internal use only

Some languages like Java allow us to create “wrapper objects” for primitives explicitly using a
syntax like new Number(1) or new Boolean(false).

In JavaScript, that’s also possible for historical reasons, but strongly discouraged. Things will go
crazy in several places.

For instance:

alert( typeof 0 ); // "number"

alert( typeof new Number(0) ); // "object"!


Objects are always truthy in if, so here the alert will show up:

let zero = new Number(0);

if (zero) { // zero is true, because it's an object
  alert( "zero is truthy!?!" );
}

On the other hand, using the same functions String/Number/Boolean without new is a totally sane
and useful thing. They convert a value to the corresponding type: to a string, a number, or a boolean
(primitive).

For example, this is entirely valid:

let num = Number("123"); // convert a string to number
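The difference between calling these functions with and without new can be checked directly. The following is only an illustrative sketch: the variable names are made up for this example, and console.log is used instead of alert so that it also runs outside the browser.

```javascript
// Without new: a plain conversion, the result is a primitive.
const converted = Number("123");
// With new: a wrapper object (shown only for comparison, avoid it in real code).
const wrapped = new Number(123);

console.log(typeof converted); // "number"
console.log(typeof wrapped);   // "object"

// The primitive 0 is falsy, but a wrapper object around 0 is truthy.
const primitiveZeroIsTruthy = Boolean(0);
const wrappedZeroIsTruthy = Boolean(new Number(0));
console.log(primitiveZeroIsTruthy, wrappedZeroIsTruthy); // false true

// String() and Boolean() convert values in the same way.
const asString = String(123);
const asBoolean = Boolean("abc");
console.log(asString, asBoolean); // "123" true
```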


null/undefined have no methods

The special primitives null and undefined are exceptions. They have no corresponding “wrapper
objects” and provide no methods. In a sense, they are “the most primitive”.

An attempt to access a property of such value would give the error:

alert(null.test); // error

Summary

• Primitives except null and undefined provide many helpful methods. We will study those in
the upcoming chapters.
• Formally, these methods work via temporary objects, but JavaScript engines are well tuned to
optimize that internally, so they are not expensive to call.

Tasks
1. Consider the following code:

let str = "Hello";

str.test = 5;
alert(str.test);

What do you think: will it work? What will be shown?


Numbers

All numbers in JavaScript are stored in 64-bit format IEEE-754, also known as “double precision
floating point numbers”.

Let’s recap and expand upon what we currently know about them.

More ways to write a number

Imagine we need to write 1 billion. The obvious way is:

let billion = 1000000000;

But in real life, we usually avoid writing a long string of zeroes as it’s easy to mistype. Also, we are
lazy. We will usually write something like "1bn" for a billion or "7.3bn" for 7 billion 300 million.
The same is true for most large numbers.

In JavaScript, we shorten a number by appending the letter "e" to the number and specifying the
zeroes count:

let billion = 1e9; // 1 billion, literally: 1 and 9 zeroes

alert( 7.3e9 ); // 7.3 billion (7,300,000,000)


In other words, "e" multiplies the number by 1 with the given zeroes count.
1e3 = 1 * 1000
1.23e6 = 1.23 * 1000000

Now let’s write something very small. Say, 1 microsecond (one millionth of a second):

let mcs = 0.000001;

Just like before, using "e" can help. If we’d like to avoid writing the zeroes explicitly, we could say:

let mcs = 1e-6; // six zeroes to the left of 1

If we count the zeroes in 0.000001, there are 6 of them. So naturally it’s 1e-6.

In other words, a negative number after "e" means a division by 1 with the given number of zeroes:

// -3 divides by 1 with 3 zeroes
1e-3 = 1 / 1000 (=0.001)

// -6 divides by 1 with 6 zeroes
1.23e-6 = 1.23 / 1000000 (=0.00000123)
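The equalities above are written as notation, not as runnable code. As a small sketch, they can be verified with strict comparisons; the object name checks is made up for this example.

```javascript
// Each entry compares a literal written with "e" against its long form.
const checks = {
  thousand: 1e3 === 1000,
  million: 1.23e6 === 1230000,
  billions: 7.3e9 === 7300000000,
  thousandth: 1e-3 === 0.001,
  microseconds: 1.23e-6 === 0.00000123,
};

console.log(checks); // every property is true
```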

Hex, binary and octal numbers

Hexadecimal numbers are widely used in JavaScript to represent colors, encode characters, and for
many other things. So naturally, there exists a shorter way to write them: 0x and then the number.

For instance:


alert( 0xff ); // 255


alert( 0xFF ); // 255 (the same, case doesn't matter)

Binary and octal numeral systems are rarely used, but also supported using the 0b and 0o prefixes:

let a = 0b11111111; // binary form of 255


let b = 0o377; // octal form of 255

alert( a == b ); // true, the same number 255 at both sides

There are only 3 numeral systems with such support. For other numeral systems, we should use the
function parseInt (which we will see later in this chapter).

toString(base)

The method num.toString(base) returns a string representation of num in the numeral system with the
given base.

For example:

let num = 255;

alert( num.toString(16) ); // ff
alert( num.toString(2) ); // 11111111

The base can vary from 2 to 36. By default it’s 10.

Common use cases for this are:

• base=16 is used for hex colors, character encodings etc.; digits can be 0..9 or A..F.
• base=2 is mostly for debugging bitwise operations; digits can be 0 or 1.
• base=36 is the maximum; digits can be 0..9 or A..Z. The whole Latin alphabet is used to
represent a number. A funny but useful case for 36 is when we need to turn a long numeric
identifier into something shorter, for example to make a short URL. We can simply represent it in
the numeral system with base 36:

alert( 123456..toString(36) ); // 2n9c

Two dots to call a method

Please note that the two dots in 123456..toString(36) are not a typo. If we want to call a method directly
on a number literal, like toString in the example above, then we need to place two dots .. after it.

If we placed a single dot: 123456.toString(36), then there would be an error, because JavaScript
syntax treats whatever follows the first dot as the decimal part. If we place one more dot, then JavaScript
knows that the decimal part is empty and that the method call follows.

We could also write (123456).toString(36).
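As a sketch of the short-identifier use case, the base-36 string can be turned back into the original number with parseInt (covered later in this chapter). The variable names are illustrative only.

```javascript
const id = 123456;

// A variable needs only one dot: the two-dot form is for number literals.
const shortId = id.toString(36);
const restored = parseInt(shortId, 36);

console.log(shortId);  // 2n9c
console.log(restored); // 123456

// Three equivalent ways to call a method on the literal 255.
const hex = (255).toString(16);
const binary = 255..toString(2);
const octal = 255 .toString(8); // a space before the dot also works

console.log(hex, binary, octal); // ff 11111111 377
```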

Rounding


One of the most used operations when working with numbers is rounding.

There are several built-in functions for rounding:

Math.floor
Rounds down: 3.1 becomes 3, and -1.1 becomes -2.
Math.ceil
Rounds up: 3.1 becomes 4, and -1.1 becomes -1.
Math.round
Rounds to the nearest integer: 3.1 becomes 3, 3.6 becomes 4 and -1.1 becomes -1.
Math.trunc (not supported by Internet Explorer)
Removes anything after the decimal point without rounding: 3.1 becomes 3, -1.1 becomes -1.

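Here’s the table to summarize the differences between them:

        Math.floor   Math.ceil   Math.round   Math.trunc
 3.1         3           4           3            3
 3.6         3           4           4            3
-1.1        -2          -1          -1           -1
-1.6        -2          -1          -2           -1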

These functions cover all of the possible ways to deal with the decimal part of a number. But what if
we’d like to round the number to n-th digit after the decimal?

For instance, we have 1.2345 and want to round it to 2 digits, getting only 1.23.

There are two ways to do so:

1. Multiply-and-divide.

For example, to round the number to the 2nd digit after the decimal, we can multiply the
number by 100, call the rounding function and then divide it back.

let num = 1.23456;

alert( Math.floor(num * 100) / 100 ); // 1.23456 -> 123.456 -> 123 -> 1.23

2. The method toFixed(n) rounds the number to n digits after the point and returns a string
representation of the result.

let num = 12.34;
alert( num.toFixed(1) ); // "12.3"

This rounds up or down to the nearest value, similar to Math.round:

let num = 12.36;
alert( num.toFixed(1) ); // "12.4"

Please note that the result of toFixed is a string. If the decimal part is shorter than required,
zeroes are appended to the end:

let num = 12.34;
alert( num.toFixed(5) ); // "12.34000", added zeroes to make exactly 5 digits

We can convert it to a number using the unary plus or a Number() call: +num.toFixed(5).
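Both approaches can be wrapped in small helpers. This is a minimal sketch: the function names roundTo and roundToViaFixed are made up for this example, and it uses Math.round for the multiply-and-divide variant.

```javascript
// Multiply-and-divide: shift the decimal point, round, shift back.
function roundTo(num, digits) {
  const factor = Math.pow(10, digits);
  return Math.round(num * factor) / factor;
}

// toFixed returns a string, so the unary plus converts it back to a number.
function roundToViaFixed(num, digits) {
  return +num.toFixed(digits);
}

console.log(roundTo(1.23456, 2));         // 1.23
console.log(roundToViaFixed(1.23456, 2)); // 1.23
console.log((12.34).toFixed(5));          // "12.34000"
console.log(roundToViaFixed(12.34, 5));   // 12.34
```

Both helpers work on binary floating-point values, so a number that looks like it sits exactly on a rounding boundary in decimal may not round the way one expects.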

Imprecise calculations

Internally, a number is represented in 64-bit format IEEE-754, so there are exactly 64 bits to store a
number: 52 of them are used to store the digits, 11 of them store the position of the decimal point,
and 1 bit is for the sign.

If a number is too big, it would overflow the 64-bit storage, potentially giving an infinity:

alert( 1e500 ); // Infinity

What may be a little less obvious, but happens quite often, is the loss of precision.

Consider this (falsy!) test:

alert( 0.1 + 0.2 == 0.3 ); // false

That’s right, if we check whether the sum of 0.1 and 0.2 is 0.3, we get false.

Strange! What is it then if not 0.3?

alert( 0.1 + 0.2 ); // 0.30000000000000004

Ouch! There are more consequences than an incorrect comparison here. Imagine you’re making an
e-shopping site and the visitor puts $0.10 and $0.20 goods into their cart. The order total will be
$0.30000000000000004. That would surprise anyone.

But why does this happen?

A number is stored in memory in its binary form, a sequence of bits – ones and zeroes. But fractions
like 0.1, 0.2 that look simple in the decimal numeric system are actually unending fractions in their
binary form.

In other words, what is 0.1? It is one divided by ten, 1/10, one-tenth. In the decimal numeral system such
numbers are easily representable. Compare it to one-third: 1/3. It becomes an endless fraction
0.33333(3).

So, division by powers of 10 is guaranteed to work well in the decimal system, but division by 3 is not.
For the same reason, in the binary numeral system, division by powers of 2 is guaranteed to
work, but 1/10 becomes an endless binary fraction.

There’s just no way to store exactly 0.1 or exactly 0.2 using the binary system, just like there is no
way to store one-third as a decimal fraction.

The numeric format IEEE-754 solves this by rounding to the nearest possible number. These
rounding rules normally don’t allow us to see that “tiny precision loss”, so the number shows up as
0.3. But beware, the loss still exists.

We can see this in action:

alert( 0.1.toFixed(20) ); // 0.10000000000000000555

And when we sum two numbers, their “precision losses” add up.

That’s why 0.1 + 0.2 is not exactly 0.3.

Not only JavaScript

The same issue exists in many other programming languages.

PHP, Java, C, Perl, Ruby give exactly the same result, because they are based on the same numeric
format.

Can we work around the problem? Sure, the most reliable method is to round the result with the help
of a method toFixed(n):

let sum = 0.1 + 0.2;


alert( sum.toFixed(2) ); // 0.30

Please note that toFixed always returns a string. It ensures that it has 2 digits after the decimal point.
That’s actually convenient if we have an online shop and need to show $0.30. For other cases, we can
use the unary plus to coerce it into a number:

let sum = 0.1 + 0.2;


alert( +sum.toFixed(2) ); // 0.3

We also can temporarily multiply the numbers by 100 (or a bigger number) to turn them into
integers, do the maths, and then divide back. Then, as we’re doing maths with integers, the error
somewhat decreases, but we still get it on division:

alert( (0.1 * 10 + 0.2 * 10) / 10 ); // 0.3


alert( (0.28 * 100 + 0.14 * 100) / 100); // 0.4200000000000001

So, multiply/divide approach reduces the error, but doesn’t remove it totally.

Sometimes we could try to avoid fractions altogether. For example, if we’re dealing with a shop, we can store
prices in cents instead of dollars. But what if we apply a discount of 30%? In practice, totally
avoiding fractions is rarely possible. Just round them to cut “tails” when needed.
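The idea of storing prices in cents can be sketched as follows. All names (itemsInCents, formatCents and so on) and the sample prices are assumptions made for this illustration, not part of any real shop code.

```javascript
// Prices are kept as integer cents, so sums stay exact.
const itemsInCents = [10, 20];
const totalCents = itemsInCents.reduce((sum, price) => sum + price, 0);

// Convert to dollars only for display; toFixed(2) keeps two digits.
function formatCents(cents) {
  return "$" + (cents / 100).toFixed(2);
}

console.log(formatCents(totalCents)); // $0.30

// A 30% discount can produce a fraction of a cent, so round once.
const priceCents = 1999;
const discountedCents = Math.round(priceCents * 0.7);

console.log(discountedCents);              // 1399
console.log(formatCents(discountedCents)); // $13.99
```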

The funny thing

Try running this:

// Hello! I'm a self-increasing number!


alert( 9999999999999999 ); // shows 10000000000000000


This suffers from the same issue: a loss of precision. There are 64 bits for the number, 52 of them
can be used to store digits, but that’s not enough. So the least significant digits disappear.

JavaScript doesn’t trigger an error in such events. It does its best to fit the number into the desired
format, but unfortunately, this format is not big enough.

Two zeroes

Another funny consequence of the internal representation of numbers is the existence of two zeroes:
0 and -0.

That’s because a sign is represented by a single bit, so every number can be positive or negative,
including a zero.

In most cases the distinction is unnoticeable, because operators are suited to treat them as the same.

Tests: isFinite and isNaN

Remember these two special numeric values?

• Infinity (and -Infinity) is a special numeric value that is greater (less) than anything.
• NaN represents an error.

They belong to the type number, but are not “normal” numbers, so there are special functions to
check for them:

• isNaN(value) converts its argument to a number and then tests it for being NaN:

alert( isNaN(NaN) ); // true
alert( isNaN("str") ); // true

But do we need this function? Can’t we just use the comparison === NaN? Sorry, but the answer is
no. The value NaN is unique in that it does not equal anything, including itself:

alert( NaN === NaN ); // false

isFinite(value) converts its argument to a number and returns true if it’s a regular number, not
NaN/Infinity/-Infinity:

alert( isFinite("15") ); // true


alert( isFinite("str") ); // false, because a special value: NaN
alert( isFinite(Infinity) ); // false, because a special value: Infinity

Sometimes isFinite is used to validate whether a string value is a regular number:

let num = +prompt("Enter a number", '');

// will be true unless you enter Infinity, -Infinity or not a number


alert( isFinite(num) );

Please note that an empty or a space-only string is treated as 0 in all numeric functions including
isFinite.
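Because an empty or space-only string counts as 0, a validation helper usually needs one extra check. The following is a minimal sketch; the function name isRegularNumber is hypothetical.

```javascript
// Returns true only for strings that hold a regular (finite) number.
function isRegularNumber(input) {
  // Reject non-strings and empty/space-only strings, which isFinite treats as 0.
  if (typeof input !== "string" || input.trim() === "") return false;
  return isFinite(input);
}

console.log(isFinite(""));              // true, the gotcha described above
console.log(isRegularNumber(""));       // false
console.log(isRegularNumber("   "));    // false
console.log(isRegularNumber("15"));     // true
console.log(isRegularNumber("12px"));   // false
console.log(isRegularNumber("Infinity")); // false
```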


Compare with Object.is

There is a special built-in method Object.is that compares values like ===, but is more reliable for
two edge cases:

1. It works with NaN: Object.is(NaN, NaN) === true, that’s a good thing.
2. Values 0 and -0 are different: Object.is(0, -0) === false, technically that’s true, because
internally the number has a sign bit that may be different even if all other bits are zeroes.

In all other cases, Object.is(a, b) is the same as a === b.

This way of comparison is often used in JavaScript specification. When an internal algorithm needs
to compare two values for being exactly the same, it uses Object.is (internally called SameValue).
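The two edge cases can be seen side by side in a short sketch (the object name results is made up for this example):

```javascript
const results = {
  nanStrict: NaN === NaN,            // false
  nanObjectIs: Object.is(NaN, NaN),  // true
  zeroStrict: 0 === -0,              // true
  zeroObjectIs: Object.is(0, -0),    // false
  // For ordinary values both comparisons agree.
  ordinary: Object.is("a", "a") === ("a" === "a"), // true
};

console.log(results);

// One place where the sign of zero becomes visible: division.
console.log(1 / 0);  // Infinity
console.log(1 / -0); // -Infinity
```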

parseInt and parseFloat

Numeric conversion using a plus + or Number() is strict. If a value is not exactly a number, it fails:

alert( +"100px" ); // NaN

The sole exception is spaces at the beginning or at the end of the string, as they are ignored.

But in real life we often have values in units, like "100px" or "12pt" in CSS. Also in many countries
the currency symbol goes after the amount, so we have "19€" and would like to extract a numeric
value out of that.

That’s what parseInt and parseFloat are for.

They “read” a number from a string until they can’t. In case of an error, the gathered number is
returned. The function parseInt returns an integer, whilst parseFloat will return a floating-point
number:

alert( parseInt('100px') ); // 100


alert( parseFloat('12.5em') ); // 12.5

alert( parseInt('12.3') ); // 12, only the integer part is returned


alert( parseFloat('12.3.4') ); // 12.3, the second point stops the reading

There are situations when parseInt/parseFloat will return NaN. It happens when no digits could be
read:

alert( parseInt('a123') ); // NaN, the first symbol stops the process

The second argument of parseInt(str, radix)

The parseInt() function has an optional second parameter. It specifies the base of the numeral system,
so parseInt can also parse strings of hex numbers, binary numbers and so on:

alert( parseInt('0xff', 16) ); // 255


alert( parseInt('ff', 16) ); // 255, without 0x also works

alert( parseInt('2n9c', 36) ); // 123456
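A typical use is reading the numeric part of a value with units. This sketch assumes a hypothetical helper readLeadingNumber that returns null when nothing could be read; passing the radix 10 to parseInt explicitly is a common habit, not a requirement stated above.

```javascript
// Reads the leading number from strings such as "100px" or "12.5em".
function readLeadingNumber(value) {
  const parsed = parseFloat(value);
  return isNaN(parsed) ? null : parsed;
}

console.log(readLeadingNumber("100px"));  // 100
console.log(readLeadingNumber("12.5em")); // 12.5
console.log(readLeadingNumber("auto"));   // null

// parseInt keeps only the integer part; the radix selects the numeral system.
console.log(parseInt("19€", 10)); // 19
console.log(parseInt("ff", 16));  // 255
console.log(+"19€");              // NaN, the strict conversion fails
```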


Other math functions

JavaScript has a built-in Math object which contains a small library of mathematical functions and
constants.

A few examples:

Math.random()

Returns a random number from 0 to 1 (not including 1).

alert( Math.random() ); // 0.1234567894322
alert( Math.random() ); // 0.5435252343232
alert( Math.random() ); // ... (any random numbers)

Math.max(a, b, c...) / Math.min(a, b, c...)

Returns the greatest/smallest from an arbitrary number of arguments.

alert( Math.max(3, 5, -10, 0, 1) ); // 5
alert( Math.min(1, 2) ); // 1

Math.pow(n, power)

Returns n raised to the given power.

alert( Math.pow(2, 10) ); // 2 to the power of 10 = 1024

There are more functions and constants in the Math object, including trigonometry.

Summary

To write numbers with many zeroes:

• Append "e" with the zeroes count to the number. Like: 123e6 is 123 with 6 zeroes.
• A negative number after "e" causes the number to be divided by 1 with the given number of zeroes.
That’s for one-millionth or such.

For different numeral systems:

• We can write numbers directly in hex (0x), octal (0o) and binary (0b) systems.
• parseInt(str, base) parses an integer from any numeral system with base: 2 ≤ base ≤ 36.
• num.toString(base) converts a number to a string in the numeral system with the given base.

For converting values like 12pt and 100px to a number:

• Use parseInt/parseFloat for the “soft” conversion, which reads a number from a string and
then returns the value it could read before the error.

For fractions:

• Round using Math.floor, Math.ceil, Math.trunc, Math.round or num.toFixed(precision).
• Make sure to remember there’s a loss of precision when working with fractions.

More mathematical functions:

• See the Math object when you need them. The library is very small, but can cover basic
needs.

Tasks

Sum numbers from the visitor


importance: 5

Create a script that prompts the visitor to enter two numbers and then shows their sum.

P.S. There is a gotcha with types.

Why 6.35.toFixed(1) == 6.3?


importance: 4

According to the documentation Math.round and toFixed both round to the nearest number: 0..4 lead
down while 5..9 lead up.

For instance:

alert( 1.35.toFixed(1) ); // 1.4

In the similar example below, why is 6.35 rounded to 6.3, not 6.4?

alert( 6.35.toFixed(1) ); // 6.3

How to round 6.35 the right way?

1. Create a function readNumber which prompts for a number until the visitor enters a valid
numeric value.

The resulting value must be returned as a number.

The visitor can also stop the process by entering an empty line or pressing “CANCEL”. In that
case, the function should return null.

2. This loop is infinite. It never ends. Why?

let i = 0;
while (i != 10) {
i += 0.2;
}

3. The built-in function Math.random() creates a random value from 0 to 1 (not including 1).

Write the function random(min, max) to generate a random floating-point number from min to
max (not including max).


Examples of its work:

alert( random(1, 5) ); // 1.2345623452


alert( random(1, 5) ); // 3.7894332423
alert( random(1, 5) ); // 4.3435234525

4. Create a function randomInteger(min, max) that generates a random integer number from min to
max including both min and max as possible values.

Any number from the interval min..max must appear with the same probability.

Examples of its work:

alert( randomInteger(1, 5) ); // 1
alert( randomInteger(1, 5) ); // 3
alert( randomInteger(1, 5) ); // 5

You can use the solution of the previous task as the base.


Strings

In JavaScript, the textual data is stored as strings. There is no separate type for a single character.

The internal format for strings is always UTF-16; it is not tied to the page encoding.

Quotes

Let’s recall the kinds of quotes.

Strings can be enclosed within either single quotes, double quotes or backticks:

let single = 'single-quoted';


let double = "double-quoted";

let backticks = `backticks`;

Single and double quotes are essentially the same. Backticks, however, allow us to embed any
expression into the string, including function calls:

function sum(a, b) {
return a + b;
}

alert(`1 + 2 = ${sum(1, 2)}.`); // 1 + 2 = 3.

Another advantage of using backticks is that they allow a string to span multiple lines:

let guestList = `Guests:
* John
* Pete
* Mary
`;

alert(guestList); // a list of guests, multiple lines

If we try to use single or double quotes in the same way, there will be an error:

let guestList = "Guests: // Error: Invalid or unexpected token
* John";

Single and double quotes come from ancient times of language creation when the need for multiline
strings was not taken into account. Backticks appeared much later and thus are more versatile.

Backticks also allow us to specify a “template function” before the first backtick. The syntax is:
func`string`. The function func is called automatically, receives the string and embedded expressions
and can process them. You can read more about it in the docs. This is called “tagged templates”. This
feature makes it easier to wrap strings into custom templating or other functionality, but it is rarely
used.
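To make the mechanics concrete, here is a minimal sketch of a tag function. The name bracket and what it does (wrapping every embedded value in square brackets) are invented for this illustration.

```javascript
// A tag function receives the literal string parts as an array,
// followed by the values of the embedded expressions.
function bracket(strings, ...values) {
  let result = "";
  strings.forEach((part, i) => {
    result += part;
    // There is always one more string part than there are values.
    if (i < values.length) result += `[${values[i]}]`;
  });
  return result;
}

const guest = "John";
const years = 30;

// No parentheses: the function is called by placing it before the backtick.
const text = bracket`Name: ${guest}, age: ${years}.`;

console.log(text); // Name: [John], age: [30].
```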

Special characters


It is still possible to create multiline strings with single and double quotes by using a so-called “newline
character”, written as \n, which denotes a line break:

let guestList = "Guests:\n * John\n * Pete\n * Mary";

alert(guestList); // a multiline list of guests

For example, these two lines describe the same:

alert( "Hello\nWorld" ); // two lines using a "newline symbol"

// two lines using a normal newline and backticks


alert( `Hello
World` );

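There are other, less common “special” characters as well. Here’s the list:

\n             New line
\r             Carriage return
\t             Tab
\b, \f, \v     Backspace, form feed, vertical tab (kept for compatibility, rarely used nowadays)
\', \"         Quotes
\\             Backslash
\uXXXX         A unicode symbol with the 4-digit hex code XXXX
\u{X…XXXXXX}   A unicode symbol with the given hex code (1 to 6 hex digits)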

Examples with unicode:

alert( "\u00A9" ); // ©
alert( "\u{20331}" ); // 佫, a rare Chinese character (long unicode)
alert( "\u{1F60D}" ); // 😍, a smiling face symbol (another long unicode)

All special characters start with a backslash character \. It is also called an “escape character”.

We would also use it if we want to insert a quote into the string.

For instance:

alert( 'I\'m the Walrus!' ); // I'm the Walrus!

As you can see, we have to prepend the inner quote by the backslash \', because otherwise it would
indicate the string end.

Of course, that refers only to the quotes that are the same as the enclosing ones. So, as a more elegant
solution, we could switch to double quotes or backticks instead:

alert( `I'm the Walrus!` ); // I'm the Walrus!


Note that the backslash \ serves for the correct reading of the string by JavaScript, then disappears.
The in-memory string has no \. You can clearly see that in alert from the examples above.

But what if we need to show an actual backslash \ within the string?

That’s possible, but we need to double it like \\:

alert( `The backslash: \\` ); // The backslash: \

String length

The length property has the string length:

alert( `My\n`.length ); // 3

Note that \n is a single “special” character, so the length is indeed 3.
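The same rule applies to every escape sequence: it is written with several symbols in the source code but counts as one character in the string. A short sketch (the object name lengths is made up for this example):

```javascript
const lengths = {
  newline: "My\n".length,     // M, y, line break
  quote: 'I\'m'.length,       // I, ', m
  backslash: "a\\b".length,   // a, \, b
  copyright: "\u00A9".length, // a single © character
};

console.log(lengths); // { newline: 3, quote: 3, backslash: 3, copyright: 1 }
```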

length is a property

People with a background in some other languages sometimes mistype by calling str.length() instead
of just str.length. That doesn’t work.

Please note that str.length is a numeric property, not a function. There is no need to add parentheses
after it.

Accessing characters

To get a character at position pos, use square brackets [pos] or call the method str.charAt(pos). The
first character starts from the zero position:

let str = `Hello`;

// the first character


alert( str[0] ); // H
alert( str.charAt(0) ); // H

// the last character


alert( str[str.length - 1] ); // o

The square brackets are a modern way of getting a character, while charAt exists mostly for
historical reasons.

The only difference between them is that if no character is found, [] returns undefined, and charAt
returns an empty string:

let str = `Hello`;

alert( str[1000] ); // undefined


alert( str.charAt(1000) ); // '' (an empty string)

We can also iterate over characters using for..of:

for (let char of "Hello") {
  alert(char); // H,e,l,l,o (char becomes "H", then "e", then "l" etc)
}

Strings are immutable

Strings can’t be changed in JavaScript. It is impossible to change a character.

Let’s try it to show that it doesn’t work:

let str = 'Hi';

str[0] = 'h'; // error in strict mode, silently ignored otherwise
alert( str[0] ); // doesn't work: still H

The usual workaround is to create a whole new string and assign it to str instead of the old one.

For instance:

let str = 'Hi';

str = 'h' + str[1]; // replace the string


alert( str ); // hi

In the following sections we’ll see more examples of this.
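The workaround generalizes to any position. This sketch uses a hypothetical helper replaceAt built on str.slice, which is described later in this chapter.

```javascript
// Builds a new string with the character at `index` replaced.
// The original string is never modified.
function replaceAt(str, index, replacement) {
  return str.slice(0, index) + replacement + str.slice(index + 1);
}

const greeting = "Hi";
const lowered = replaceAt(greeting, 0, "h");

console.log(lowered);  // hi
console.log(greeting); // Hi (unchanged)
console.log(replaceAt("Hello", 4, "O")); // HellO
```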

Changing the case

Methods toLowerCase() and toUpperCase() change the case:

alert( 'Interface'.toUpperCase() ); // INTERFACE


alert( 'Interface'.toLowerCase() ); // interface

Or, if we want a single character lowercased:

alert( 'Interface'[0].toLowerCase() ); // 'i'

Searching for a substring

There are multiple ways to look for a substring within a string.

str.indexOf

The first method is str.indexOf(substr, pos).

It looks for the substr in str, starting from the given position pos, and returns the position where the
match was found or -1 if nothing can be found.

For instance:

let str = 'Widget with id';

alert( str.indexOf('Widget') ); // 0, because 'Widget' is found at the beginning


alert( str.indexOf('widget') ); // -1, not found, the search is case-sensitive
alert( str.indexOf("id") ); // 1, "id" is found at the position 1 (..idget with id)

The optional second parameter allows us to search starting from the given position.

For instance, the first occurrence of "id" is at position 1. To look for the next occurrence, let’s start
the search from position 2:

let str = 'Widget with id';

alert( str.indexOf('id', 2) ); // 12

If we’re interested in all occurrences, we can run indexOf in a loop. Every new call is made with the
position after the previous match:

let str = 'As sly as a fox, as strong as an ox';

let target = 'as'; // let's look for it

let pos = 0;
while (true) {
  let foundPos = str.indexOf(target, pos);
  if (foundPos == -1) break;

  alert( `Found at ${foundPos}` );
  pos = foundPos + 1; // continue the search from the next position
}

The same algorithm can be laid out more concisely:

let str = "As sly as a fox, as strong as an ox";
let target = "as";

let pos = -1;
while ((pos = str.indexOf(target, pos + 1)) != -1) {
  alert( pos );
}

str.lastIndexOf(substr, position)

There is also a similar method str.lastIndexOf(substr, position) that searches from the end of a string
to its beginning.

It would list the occurrences in the reverse order.
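The search loop can be packaged as a reusable function that collects the positions instead of showing them. This is only a sketch; the name findAll and the guard for an empty target are additions made for this example.

```javascript
// Returns the positions of all occurrences of `target` in `str`.
function findAll(str, target) {
  const positions = [];
  // An empty target matches at every position and would never return -1.
  if (target === "") return positions;

  let pos = -1;
  while ((pos = str.indexOf(target, pos + 1)) !== -1) {
    positions.push(pos);
  }
  return positions;
}

const phrase = "As sly as a fox, as strong as an ox";

console.log(findAll(phrase, "as"));    // [ 7, 17, 27 ]
console.log(phrase.lastIndexOf("as")); // 27, the first match when searching from the end
```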

There is a slight inconvenience with indexOf in the if test. We can’t put it in the if like this:

let str = "Widget with id";

if (str.indexOf("Widget")) {
alert("We found it"); // doesn't work!
}


The alert in the example above doesn’t show because str.indexOf("Widget") returns 0 (meaning that
it found the match at the starting position). Right, but if considers 0 to be false.

So, we should actually check for -1, like this:

let str = "Widget with id";

if (str.indexOf("Widget") != -1) {
alert("We found it"); // works now!
}

The bitwise NOT trick

One of the old tricks used here is the bitwise NOT ~ operator. It converts the number to a 32-bit
integer (removes the decimal part if it exists) and then reverses all bits in its binary representation.

For 32-bit integers the call ~n means exactly the same as -(n+1) (a consequence of the two’s complement
representation of integers).

For instance:

alert( ~2 ); // -3, the same as -(2+1)


alert( ~1 ); // -2, the same as -(1+1)
alert( ~0 ); // -1, the same as -(0+1)
alert( ~-1 ); // 0, the same as -(-1+1)

As we can see, ~n is zero only if n == -1.

So, the test if ( ~str.indexOf("...") ) is truthy only if the result of indexOf is not -1. In other words, when
there is a match.

People use it to shorten indexOf checks:

let str = "Widget";

if (~str.indexOf("Widget")) {
alert( 'Found it!' ); // works
}

It is usually not recommended to use language features in a non-obvious way, but this particular trick
is widely used in old code, so we should understand it.

Just remember: if (~str.indexOf(...)) reads as “if found”.
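To confirm that the trick and the explicit comparison agree, both can be written as small functions and compared. The function names are hypothetical and exist only for this sketch.

```javascript
// The old trick: ~(-1) is 0 (falsy), anything else is non-zero (truthy).
function foundWithTilde(text, part) {
  return Boolean(~text.indexOf(part));
}

// The explicit form of the same test.
function foundWithCompare(text, part) {
  return text.indexOf(part) !== -1;
}

const sample = "Widget with id";

console.log(foundWithTilde(sample, "Widget"), foundWithCompare(sample, "Widget")); // true true
console.log(foundWithTilde(sample, "id"), foundWithCompare(sample, "id"));         // true true
console.log(foundWithTilde(sample, "xyz"), foundWithCompare(sample, "xyz"));       // false false

// ~n equals -(n + 1) for these small integers.
const identityHolds = [2, 1, 0, -1].every((n) => ~n === -(n + 1));
console.log(identityHolds); // true
```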

includes, startsWith, endsWith

The more modern method str.includes(substr, pos) returns true/false depending on whether str
contains substr within.

It’s the right choice if we need to test for the match, but don’t need its position:

alert( "Widget with id".includes("Widget") ); // true

alert( "Hello".includes("Bye") ); // false



The optional second argument of str.includes is the position to start searching from:

alert( "Midget".includes("id") ); // true
alert( "Midget".includes("id", 3) ); // false, from position 3 there is no "id"

The methods str.startsWith and str.endsWith do exactly what they say:

alert( "Widget".startsWith("Wid") ); // true, "Widget" starts with "Wid"
alert( "Widget".endsWith("get") ); // true, "Widget" ends with "get"
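
The three checks combine naturally. The sketch below classifies a file name with them; the helper describeFileName and the sample names are assumptions made up for illustration. It also shows the optional second argument that startsWith (a start position) and endsWith (a length to consider) accept.

```javascript
// Hypothetical helper that classifies a file name with the three methods.
function describeFileName(name) {
  return {
    hidden: name.startsWith("."),
    script: name.endsWith(".js"),
    test: name.includes("test"),
  };
}

console.log(describeFileName("widget.test.js")); // { hidden: false, script: true, test: true }

// Optional second arguments:
console.log("Widget".startsWith("get", 3)); // true, the check starts at position 3
console.log("Widget".endsWith("Wid", 3));   // true, only the first 3 characters are considered
```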

Getting a substring

There are 3 methods in JavaScript to get a substring: substring, substr and slice.

str.slice(start [, end])

Returns the part of the string from start to (but not including) end.

For instance:

let str = "stringify";

alert( str.slice(0, 5) ); // 'strin', the substring from 0 to 5 (not including 5)
alert( str.slice(0, 1) ); // 's', from 0 to 1, but not including 1, so only character at 0

If there is no second argument, then slice goes till the end of the string:

let str = "stringify";

alert( str.slice(2) ); // ringify, from the 2nd position till the end

Negative values for start/end are also possible. They mean the position is counted from the string
end:

let str = "stringify";

// start at the 4th position from the right, end at the 1st from the right
alert( str.slice(-4, -1) ); // gif

str.substring(start [, end])

Returns the part of the string between start and end.

This is almost the same as slice, but it allows start to be greater than end.

For instance:

let str = "stringify";

// these are the same for substring
alert( str.substring(2, 6) ); // "ring"
alert( str.substring(6, 2) ); // "ring"

// ...but not for slice:
alert( str.slice(2, 6) ); // "ring" (the same)
alert( str.slice(6, 2) ); // "" (an empty string)

Negative arguments are (unlike slice) not supported, they are treated as 0.
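
A quick side-by-side sketch of the two differences (negative arguments and swapped arguments), using the same word as the examples above. The variable name word is an assumption chosen for illustration.

```javascript
const word = "stringify";

console.log(word.slice(-4, -1));    // "gif", negatives count from the end
console.log(word.substring(-4, 3)); // "str", -4 is treated as 0
console.log(word.substring(6, 2));  // "ring", the arguments are swapped
console.log(word.slice(6, 2));      // "", slice never swaps
```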

str.substr(start [, length])

Returns the part of the string from start, with the given length.

In contrast with the previous methods, this one allows us to specify the length instead of the
ending position:

let str = "stringify";

alert( str.substr(2, 4) ); // ring, from the 2nd position get 4 characters

The first argument may be negative, to count from the end:

let str = "stringify";

alert( str.substr(-4, 2) ); // gi, from the 4th position from the end get 2 characters

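Let’s recap these methods to avoid any confusion:

 slice(start, end) selects from start to end (not including end) and allows negative values.
 substring(start, end) selects between start and end; negative values are treated as 0.
 substr(start, length) selects length characters starting from start and allows a negative start.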

Which one to choose?

All of them can do the job. Formally, substr has a minor drawback: it is described not in the core
JavaScript specification, but in Annex B, which covers browser-only features that exist mainly for
historical reasons. So, non-browser environments may fail to support it. But in practice it works
everywhere.

The author finds themself using slice almost all the time.
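
If the Annex B status of substr is a concern, the same "start plus length" selection can be approximated with slice. The helper below is a hypothetical sketch, not a standard method: the name takeFrom is made up, and it assumes both arguments are integers rather than reproducing every edge case of substr.

```javascript
// Hypothetical helper: substr-like selection built on slice.
// Assumes `start` and `length` are integers; other inputs are not handled.
function takeFrom(str, start, length) {
  const from = start < 0
    ? Math.max(str.length + start, 0) // negative start counts from the end
    : Math.min(start, str.length);
  return str.slice(from, from + Math.max(length, 0));
}

console.log(takeFrom("stringify", 2, 4));  // "ring"
console.log(takeFrom("stringify", -4, 2)); // "gi"
```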

Comparing strings

As we know from the chapter Comparisons, strings are compared character-by-character in
alphabetical order.

Although, there are some oddities.

1. A lowercase letter is always greater than the uppercase:

alert( 'a' > 'Z' ); // true

2. Letters with diacritical marks are “out of order”:

alert( 'Österreich' > 'Zealand' ); // true

This may lead to strange results if we sort these country names. Usually people would expect
Zealand to come after Österreich in the list.


To understand what happens, let’s review the internal representation of strings in JavaScript.

All strings are encoded using UTF-16. That is: each character has a corresponding numeric code.
There are special methods that allow us to get the character for a code and the code for a character.

str.codePointAt(pos)

Returns the code for the character at position pos:

// different case letters have different codes
alert( "z".codePointAt(0) ); // 122
alert( "Z".codePointAt(0) ); // 90

String.fromCodePoint(code)

Creates a character by its numeric code:

alert( String.fromCodePoint(90) ); // Z

We can also add Unicode characters by their codes using \u followed by the hex code:

// 90 is 5a in hexadecimal system
alert( '\u005a' ); // Z
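
The three notations describe the same character, which a short round trip can confirm. This is only an illustrative sketch; the variable names are assumptions.

```javascript
const code = "Z".codePointAt(0);              // 90
const hex = code.toString(16);                // "5a"
const fromCode = String.fromCodePoint(code);  // "Z"
const fromEscape = "\u005a";                  // "Z"

console.log(code, hex, fromCode === fromEscape); // 90 5a true
```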

Now let’s see the characters with codes 65..220 (the Latin alphabet and a little bit extra) by making a
string of them:

let str = '';

for (let i = 65; i <= 220; i++) {
  str += String.fromCodePoint(i);
}
alert( str );
// ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~
// ¡¢£¤¥¦§¨©ª«¬®¯°±²³´µ¶·¸¹º»¼½¾¿ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖ×ØÙÚÛÜ
// (codes 127..159 are control characters and are not displayed)

See? Capital characters go first, then a few special ones, then lowercase characters.

Now it becomes obvious why a > Z.

The characters are compared by their numeric code. The greater code means that the character is
greater. The code for a (97) is greater than the code for Z (90).

 All lowercase letters go after uppercase letters because their codes are greater.
 Some letters like Ö stand apart from the main alphabet. Here, its code is greater than
anything from a to z.
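
The rule can be spelled out as a small function. This is a minimal sketch of code-based comparison written for illustration (the name compareByCodeUnits is an assumption); in real code the built-in < and > operators already do this.

```javascript
// Hypothetical helper: returns -1, 0 or 1, comparing strings by UTF-16 code units.
function compareByCodeUnits(a, b) {
  const shared = Math.min(a.length, b.length);
  for (let i = 0; i < shared; i++) {
    const diff = a.charCodeAt(i) - b.charCodeAt(i);
    if (diff !== 0) return Math.sign(diff);
  }
  // All shared positions are equal: the longer string is greater.
  return Math.sign(a.length - b.length);
}

console.log(compareByCodeUnits("a", "Z"));                // 1, because 97 > 90
console.log(compareByCodeUnits("Österreich", "Zealand")); // 1, because 214 > 90
console.log(compareByCodeUnits("Bee", "Be"));             // 1, the longer string wins
```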

Correct comparisons

The “right” algorithm to do string comparisons is more complex than it may seem, because alphabets
are different for different languages. The same-looking letter may be located differently in different
alphabets.


So, the browser needs to know the language to compare.

Luckily, all modern browsers (IE10 and earlier require the additional library Intl.JS) support the
internationalization standard ECMA-402.

It provides a special method to compare strings in different languages, following their rules.

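The call str.localeCompare(str2) returns an integer indicating whether str is less than, equal to or
greater than str2 according to the language rules:

 Returns a negative number if str is less than str2.
 Returns a positive number if str is greater than str2.
 Returns 0 if they are equivalent.

For instance:

alert( 'Österreich'.localeCompare('Zealand') ); // -1 (engines may return any negative number)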

This method actually has two additional arguments specified in the documentation. They allow us
to specify the language (by default taken from the environment) and to set up additional rules, such as
case sensitivity or whether "a" and "á" should be treated as the same.
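
As a sketch of how those arguments are used, the example below sorts a few country names first by character codes and then with localeCompare. The locale tags "de" and "en" and the sample list are illustrative assumptions, and the locale-aware output shown assumes an environment with Intl collation data available.

```javascript
const countries = ["Österreich", "Andorra", "Vietnam"];

// Default sort compares by character codes, so Ö lands after every unaccented letter.
const byCode = [...countries].sort();

// localeCompare with an explicit locale ("de" is an illustrative choice).
const byLocale = [...countries].sort((a, b) => a.localeCompare(b, "de"));

console.log(byCode);   // [ 'Andorra', 'Vietnam', 'Österreich' ]
console.log(byLocale); // [ 'Andorra', 'Österreich', 'Vietnam' ]

// The options object can relax the comparison: "base" ignores accents and case.
console.log("a".localeCompare("á", "en", { sensitivity: "base" })); // 0
```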

Internals, Unicode

Advanced knowledge

This section goes deeper into string internals. This knowledge will be useful for you if you plan to
deal with emoji, rare mathematical or hieroglyphic characters or other rare symbols.

You can skip the section if you don’t plan to support them.

Surrogate pairs

Most symbols have a 2-byte code. Letters in most European languages, numbers, and even most
hieroglyphs, have a 2-byte representation.

But 2 bytes only allow 65536 combinations and that’s not enough for every possible symbol. So rare
symbols are encoded with a pair of 2-byte characters called “a surrogate pair”.

The length of such symbols is 2:

alert( '𝒳'.length ); // 2, MATHEMATICAL SCRIPT CAPITAL X
alert( '😂'.length ); // 2, FACE WITH TEARS OF JOY
alert( '𩷶'.length ); // 2, a rare Chinese hieroglyph

Note that surrogate pairs did not exist at the time when JavaScript was created, and thus are not
correctly processed by the language!

We actually have a single symbol in each of the strings above, but the length shows a length of 2.

String.fromCodePoint and str.codePointAt are among the few methods that handle surrogate pairs
correctly. They were added to the language later (in ES2015). Before them, there were only
String.fromCharCode and str.charCodeAt. These methods do the same job as
fromCodePoint/codePointAt, but don’t work with surrogate pairs.
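
The difference between the two families is easy to see on a single emoji. A small sketch (the variable name smiley is an assumption):

```javascript
const smiley = "😂";

console.log(smiley.codePointAt(0).toString(16)); // 1f602, the code of the whole symbol
console.log(smiley.charCodeAt(0).toString(16));  // d83d, only the first half of the pair

console.log(String.fromCodePoint(0x1f602) === smiley); // true
console.log(String.fromCharCode(0x1f602) === smiley);  // false, the code is cut to 16 bits
```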

But, for instance, getting a symbol can be tricky, because surrogate pairs are treated as two
characters:

alert( '𝒳'[0] ); // strange symbols...
alert( '𝒳'[1] ); // ...pieces of the surrogate pair

Note that pieces of the surrogate pair have no meaning without each other. So the alerts in the
example above actually display garbage.

Technically, surrogate pairs are also detectable by their codes: if a character has the code in the
interval of 0xd800..0xdbff, then it is the first part of the surrogate pair. The next character (second
part) must have the code in interval 0xdc00..0xdfff. These intervals are reserved exclusively for
surrogate pairs by the standard.

In the case above:

// charCodeAt is not surrogate-pair aware, so it gives codes for parts

alert( '𝒳'.charCodeAt(0).toString(16) ); // d835, between 0xd800 and 0xdbff
alert( '𝒳'.charCodeAt(1).toString(16) ); // dcb3, between 0xdc00 and 0xdfff
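
Those two intervals are enough to split a string into whole symbols by hand. The function below is a minimal sketch under that assumption; the name splitSymbols is made up for illustration, and it keeps an unpaired surrogate as a separate item rather than trying to repair it.

```javascript
// Hypothetical helper: split a string into symbols, keeping surrogate pairs together.
function splitSymbols(str) {
  const symbols = [];
  for (let i = 0; i < str.length; i++) {
    const code = str.charCodeAt(i);
    const next = str.charCodeAt(i + 1); // NaN past the end, so the checks below fail
    const isFirstPart = code >= 0xd800 && code <= 0xdbff;
    const isSecondPart = next >= 0xdc00 && next <= 0xdfff;
    if (isFirstPart && isSecondPart) {
      symbols.push(str[i] + str[i + 1]);
      i++; // skip the second half, it is already consumed
    } else {
      symbols.push(str[i]);
    }
  }
  return symbols;
}

console.log(splitSymbols("a𝒳😂"));        // [ 'a', '𝒳', '😂' ]
console.log(splitSymbols("a𝒳😂").length); // 3, while "a𝒳😂".length is 5
```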

You will find more ways to deal with surrogate pairs later in the chapter Iterables. There are
probably special libraries for that too, but nothing famous enough to suggest here.

Diacritical marks and normalization

In many languages there are symbols that are composed of the base character with a mark
above/under it.

For instance, the letter a can be the base character for: àáâäãåā. The most common “composite”
characters have their own codes in the Unicode table. But not all of them, because there are too many
possible combinations.

To support arbitrary compositions, Unicode allows us to use several characters in a row: the base
character followed by one or more “mark” characters that “decorate” it.

For instance, if we have S followed by the special “dot above” character (code \u0307), it is shown
as Ṡ.

alert( 'S\u0307' ); // Ṡ

If we need an additional mark above the letter (or below it) – no problem, just add the necessary
mark character.

For instance, if we append a character “dot below” (code \u0323), then we’ll have “S with dots
above and below”: Ṩ.

For example:

alert( 'S\u0307\u0323' ); // Ṩ

This provides great flexibility, but also an interesting problem: two characters may visually look the
same, but be represented with different unicode compositions.

For instance:

alert( 'S\u0307\u0323' ); // Ṩ, S + dot above + dot below
alert( 'S\u0323\u0307' ); // Ṩ, S + dot below + dot above

alert( 'S\u0307\u0323' == 'S\u0323\u0307' ); // false

To solve this, there exists a “unicode normalization” algorithm that brings each string to the single
“normal” form.

It is implemented by str.normalize().

alert( "S\u0307\u0323".normalize() == "S\u0323\u0307".normalize() ); // true

It’s funny that in our situation normalize() actually brings together a sequence of 3 characters to one:
\u1e68 (S with two dots).

alert( "S\u0307\u0323".normalize().length ); // 1

alert( "S\u0307\u0323".normalize() == "\u1e68" ); // true

In reality, this is not always the case. Here it works because the symbol Ṩ is “common enough”, so
the Unicode creators included it in the main table and gave it its own code.
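
In practice this suggests normalizing both sides before comparing. The helper below is a hypothetical sketch (the name sameText is an assumption); it also shows that normalize accepts a form name, where "NFC" (the default) composes characters and "NFD" decomposes them.

```javascript
// Hypothetical helper: compare two strings after normalizing both.
function sameText(a, b) {
  return a.normalize() === b.normalize();
}

console.log(sameText("S\u0307\u0323", "S\u0323\u0307")); // true

// The composed and decomposed forms of the same symbol differ in length.
console.log("\u1e68".normalize("NFC").length); // 1
console.log("\u1e68".normalize("NFD").length); // 3, S plus two marks
```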

Summary

 There are 3 types of quotes. Backticks allow a string to span multiple lines and embed
expressions.
 Strings in JavaScript are encoded using UTF-16.
 We can use special characters like \n and insert letters by their Unicode codes using \u....
 To get a character, use: [].
 To get a substring, use: slice or substring.
 To lowercase/uppercase a string, use: toLowerCase/toUpperCase.
 To look for a substring, use: indexOf, or includes/startsWith/endsWith for simple checks.
 To compare strings according to the language, use: localeCompare, otherwise they are
compared by character codes.

There are several other helpful methods in strings:

 str.trim() – removes (“trims”) spaces from the beginning and end of the string.
 str.repeat(n) – repeats the string n times.
 …and more. See the manual for details.
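
A brief sketch of the two methods just listed (the variable names are illustrative assumptions):

```javascript
const padded = "   Hello   ";
const trimmed = padded.trim();
const repeated = "ab".repeat(3);

console.log("[" + trimmed + "]"); // [Hello]
console.log(repeated);            // ababab
```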

Strings also have methods for doing search/replace with regular expressions. But that topic deserves
a separate chapter, so we’ll return to that later.

Tasks

1. Write a function ucFirst(str) that returns the string str with the uppercased first character, for
instance:

ucFirst("john") == "John";

2. Write a function checkSpam(str) that returns true if str contains ‘viagra’ or ‘XXX’, otherwise
false.

The function must be case-insensitive:

checkSpam('buy ViAgRA now') == true
checkSpam('free xxxxx') == true
checkSpam("innocent rabbit") == false

3. Create a function truncate(str, maxlength) that checks the length of the str and, if it exceeds
maxlength – replaces the end of str with the ellipsis character "…", to make its length equal to
maxlength.

The result of the function should be the truncated (if needed) string.

For instance:

truncate("What I'd like to tell on this topic is:", 20) == "What I'd like to te…"

truncate("Hi everyone!", 20) == "Hi everyone!"

4. We have a cost in the form "$120". That is: the dollar sign goes first, and then the number.

Create a function extractCurrencyValue(str) that would extract the numeric value from such a
string and return it.

For example:

alert( extractCurrencyValue('$120') === 120 ); // true
