PHP: Basic syntax, arrays, loops, functions

(unfinished!)

Conditional control structures

In the previous session, we discussed the simplest possible program capable of doing useful work—a simple clone of the UNIX utility cat, which reads lines from input and prints them out. The second simplest program might be a riff on UNIX grep: a program that reads lines in from input, then prints them out only if certain conditions are met. We’ll call this kind of program a filter. The next example is just such a program (simplefilter.php):

<?php

while ($line = fgets(STDIN)) {
  $line = rtrim($line);
  $length = strlen($line);
  if ($length < 10) {
    echo "$line\n";
  }
}

This program is similar in structure to simplecat.php. The strlen function, as you might imagine, returns the length of a string. We assign that value to a variable ($length) and then use an if statement to see if the value meets some criterion (in this case, whether it’s less than ten). If so, we print the line. Here’s an example transcript of using the program from the command line (first printing the unfiltered text):

$ cat <this_is_just.txt 
this is just to say

i have eaten
the plums
that were in
the icebox

and which
you were probably
saving
for breakfast

forgive me
they were delicious
so sweet
and so cold

$ php simplefilter.php <this_is_just.txt 

the plums

and which
saving

so sweet

$

As you can see, simplefilter.php prints out only those lines shorter than ten characters (giving us a different reading of the original text).

PHP has many comparison operators, most of which work in a manner similar or identical to their C/C++ counterparts. Here’s a list. The operators you’ll see most often are == for “is equal to” and != for “is not equal to.”
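
As a quick illustration, here is a minimal sketch of a few of these operators in action. The variables $a and $b and their values are made up for the example; var_dump() is used because it shows true and false explicitly.

```php
<?php

$a = 5; // made-up values for the example
$b = 8;

// var_dump() prints bool(true) or bool(false) for each comparison
var_dump($a == $b);  // bool(false): 5 is not equal to 8
var_dump($a != $b);  // bool(true)
var_dump($a < $b);   // bool(true)
var_dump($a >= $b);  // bool(false)
```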

Else and elseif

Just as in C, C++ and Java, if statements can be followed by else if and else clauses. PHP also recognizes the keyword elseif, which is more or less identical to else if (more details here).

The general schema:

if (condition1) {
  // statements
}
elseif (condition2) {
  // further statements
}
else {
  // even more statements
}

Zero or more elseifs can be present, and the else block can, of course, be omitted.
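
To make the schema concrete, here is a small sketch. The variable $temperature, its value, and the labels are all invented for the example; only the first condition that succeeds has its block executed.

```php
<?php

$temperature = 15; // made-up value for the example

if ($temperature < 0) {
  $label = "freezing";
}
elseif ($temperature < 20) {
  $label = "mild";
}
else {
  $label = "warm";
}

echo "$label\n"; // prints "mild"
```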

Mini exercise 1. Modify simplefilter.php above to print “LONG” for each line that is longer than ten characters and “SHORT” for each line that is shorter than ten characters. Lines of length zero should print “NONE”. Bonus: What does the output of your program say in Morse code?

Comparison and types

PHP is as loosely typed as a language can get. If you assign a string to a variable, the variable becomes a string; if you assign an integer to a variable, the variable becomes an integer. You don’t get to choose what the type of your variable is. Some consider this to be one of the language’s strengths, and when everything’s going just right in PHP, you hardly have to think about types at all.

One benefit of PHP’s loose typing is that you don’t have to worry about converting between strings and integers in order to compare the two: 5 == "5" evaluates to true, for example. When a value is used as a condition, the number 0, the string "0", the empty string, the boolean constant false, and the special constant null all evaluate to false, while a non-empty string (other than "0"), a non-zero integer, and the constant true all evaluate to true.
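
If you’d like to check these rules yourself, the following sketch casts a handful of values to booleans, which is the same conversion an if statement performs. The sample values are arbitrary.

```php
<?php

// (bool) converts a value to true or false the same way an if statement does
var_dump((bool) 0);        // bool(false)
var_dump((bool) "0");      // bool(false)
var_dump((bool) "");       // bool(false)
var_dump((bool) null);     // bool(false)
var_dump((bool) "hello");  // bool(true)
var_dump((bool) 7);        // bool(true)

// loose comparison between an integer and a numeric string
var_dump(5 == "5");        // bool(true)
```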

All things being equal

But there are drawbacks. Although PHP internally represents values as belonging to different types (integers, strings, booleans, etc.), it will blithely allow you to compare and operate on values of differing types, sometimes with baffling results. Here’s more on PHP’s types, and what happens when types mix.

One issue worthy of specific note is comparisons on equality. PHP actually has two equality operators: == and ===. The former checks to see if two values are equal (“loose”); the latter checks to see if two values are equal and of the same type (“strict”).

You’ll see the “strict” equality operator mainly for functions whose return values distinguish between false and the number 0, like strpos. A call to strpos will return either the integer index of a substring in a given string, or false if the substring wasn’t found. For example, if you wanted to test to see if a string $test begins with the letter ‘A’, the following code would not work:

$test = "All good boys deserve PHP";
if (strpos($test, "A") == 0) {
  echo "Begins with 'A'\n";
}

This code wouldn’t work because the condition in line 2 would succeed even if A weren’t found in the string—false and integer 0 are “loosely” equal.

If you replace the == with ===, the code behaves as desired:

$test = "All good boys deserve PHP";
if (strpos($test, "A") === 0) {
  echo "Begins with 'A'\n";
}
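
To see the difference, try a string that doesn’t contain ‘A’ at all. This is a small sketch; the test string is made up for the example.

```php
<?php

$test = "Zebras love PHP"; // contains no uppercase "A" anywhere

var_dump(strpos($test, "A"));        // bool(false): substring not found
var_dump(strpos($test, "A") == 0);   // bool(true): false is loosely equal to 0
var_dump(strpos($test, "A") === 0);  // bool(false): false is not the integer 0
```

Only the strict comparison correctly reports that the string does not begin with ‘A’.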

Here’s PHP’s type comparison table, which shows how the two equality operators behave when given values of different types.

PHP syntax: a few more details

  • Variable names must begin with $ followed by a letter or underscore, followed by any number of letters, underscores, or numbers.
  • Case sensitivity: functions and class names in PHP are not case sensitive. This applies to built-in functions and user-defined functions: both strlen() and StRLEn() call the same function. Variable names, on the other hand, are case sensitive.
  • Whitespace: PHP doesn’t care about your whitespace. It’s up to you to keep your code clean.
  • Conditional statements and loops with only one line are permitted to omit curly braces. (e.g. if ($foo) { echo "foo"; } and if ($foo) echo "foo"; behave the same)
  • You can assign to a variable without declaring it, but using a variable before you’ve assigned anything to it will generate a warning.
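
A few of the points above, sketched in code. The variable names and strings are made up for the example.

```php
<?php

// function names are not case sensitive: StrLen() is the built-in strlen()
$length = StrLen("plums");
echo "$length\n"; // 5

// variable names are case sensitive: $Name and $name are different variables
$Name = "upper";
$name = "lower";
echo "$Name $name\n"; // upper lower

// a one-statement if may omit the curly braces
if ($name == "lower") echo "no braces needed\n";
```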

Comments

You can write comments in PHP in three different styles:

// C++ style: this is a comment
/* C style: this is also a comment */
# Perl/shell script style: from pound to end of line is a comment

Arrays and arithmetic: crunching numbers with PHP

Next up, stats.php, which reads in comma-separated data from a text file (nbastats2009.txt, a slightly modified version of data available from Doug Stats). The data concern the statistics totals for National Basketball Association players in the 2009 season. This program totals up various statistics from the file and prints out some interesting information.

Here’s the source code.

<?php

$total_score = 0;
$total_players = 0;
$highest_fgm = 0;
$highest_player = "";

fgets(STDIN); // discard first line
while ($line = fgets(STDIN)) {
  $stats = explode(",", $line);
  if ($stats[4] < 10) continue;
  $total_score += $stats[20];
  $total_players++;
  if ($stats[6] > $highest_fgm) {
    $highest_fgm = $stats[6];
    $highest_player = $stats[1] . " " . $stats[0];
  }
}

$league_average = $total_score / $total_players;

echo "Total score: $total_score\n";
echo "Total players: $total_players\n";
echo "League average: $league_average\n";
echo "Player with most field goals: $highest_player ($highest_fgm)\n";

This is our first PHP program to make use of an array. The array is one of PHP’s compound data types: a variable that can store multiple values. There are syntactic similarities between PHP’s array and (e.g.) arrays in C, C++, and Java; unlike arrays in those languages, however, arrays in PHP don’t have a predefined size and can contain values of different types.

The array in question here is $stats, which is returned from a call to explode(). The explode function takes a string and breaks it into smaller strings, using the given parameter (in this case, ,) as a delimiter. Each line of the input file looks like this:

williams,deron,uta,PG,68,2507,463,984,70,226,326,384,24,195,725,73,227,20,134,2,1322,9,0,0,68

So on each iteration of the loop, $stats is an array with twenty-five elements, each corresponding to one of the given player’s attributes or statistics. (The first line of the file tells which columns mean what. The presence of this header line is the reason we read and discard a line of input on line 8 of the code above, the bare fgets(STDIN) call before the loop.)

Arrays with numerical indices can be dereferenced using a familiar syntax, as illustrated in the code above. Use brackets with an integer index inside them (e.g., [4]) to access the corresponding element of the array. (PHP array indices are zero-based.)
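
Here is a short sketch of explode() and bracket indexing applied to the sample line shown above. The count() function, which returns the number of elements in an array, is covered later in these notes.

```php
<?php

// one line from the data file, copied from the example above
$line = "williams,deron,uta,PG,68,2507,463,984,70,226,326,384,24,195,725,73,227,20,134,2,1322,9,0,0,68";
$stats = explode(",", $line);

echo count($stats) . "\n";               // 25
echo $stats[1] . " " . $stats[0] . "\n"; // deron williams
echo $stats[4] . "\n";                   // 68, the value stats.php compares against 10
echo $stats[20] . "\n";                  // 1322, the value stats.php adds to $total_score
```

Note that explode() always returns strings; PHP converts them to numbers when they are used in arithmetic.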

Hello operator

The code above also demonstrates a number of PHP’s arithmetic and assignment operators. For C, C++ and Java programmers, these should look familiar, and indeed in PHP these operators work in a very similar way (right down to precedence). More details here.

Note, in particular, the use of the increment operator ++. It works in PHP just as it does in C and C++.

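One important exception that you might notice in the example above is that PHP’s division operator does floating-point division, not integer division (as you might expect): 7 / 2 is 3.5, and the result is an integer only when both operands are integers that divide evenly. (You can fake integer division by sending the result of a division to the intval() function; PHP 7 and later also provide intdiv().)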
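
A quick sketch of how division behaves, using arbitrary numbers. Note that intdiv() is only available in PHP 7 and later.

```php
<?php

var_dump(7 / 2);          // float(3.5)
var_dump(6 / 3);          // int(2): both operands are integers and divide evenly
var_dump(intval(7 / 2));  // int(3): the fractional part is dropped
var_dump(intdiv(7, 2));   // int(3): built-in integer division, PHP 7 and later
```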

Loop control

On line 11 in the code above (the line that begins if ($stats[4] < 10)), you’ll notice the continue statement. This statement tells PHP to stop the current iteration of the loop immediately and move on to the next one. (Just like continue in C.) In this case, we’re using it to exclude players who played fewer than ten games in the season, since their numbers might throw off the other statistics.

PHP has another loop control statement, break, which ceases execution of the loop immediately, forgoing all subsequent iterations.
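
The following sketch contrasts the two statements in a single loop. The counter $n and the cutoff of 7 are arbitrary choices for the example.

```php
<?php

$n = 0;
while (true) {
  $n++;
  if ($n % 2 == 0) continue; // even number: skip to the next iteration
  if ($n > 7) break;         // past the cutoff: leave the loop entirely
  echo "$n\n";
}
// prints 1, 3, 5 and 7, each on its own line
```

Without the break, this loop would never end, since its condition is always true.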

Mini exercise 2. Modify the above code to track other information, like the total number of assists (AS) for all players, or the average number of blocks (BK) among shooting guards (whose position PS is SG).

Arrays, strings, and foreach

Swerving back toward text analysis, the next example program, tokens.php, shows a new kind of loop: foreach. The purpose of the program is to serve as a basic clone of the UNIX utility wc: it prints out all words (“tokens”) in the given input and displays the total number of words it found. Here’s the code:

<?php

$all_tokens = array(); // create empty array

while ($line = fgets(STDIN)) {
	$line = rtrim($line);
	$tokens = explode(" ", $line);
	foreach ($tokens as $t) {
		$all_tokens[] = $t;
	}
}

foreach ($all_tokens as $t) {
	echo "$t\n";
}
echo "Total tokens: " . count($all_tokens) . "\n";

Unlike the other programs we’ve worked on, this program doesn’t just work on each line of input in sequence. Instead, it builds up a data structure, which it then uses to produce output after all of the lines of input have been read.

The first new thing is the use of the array() construct on line 3 (the first statement after the opening <?php tag). This is how you create an empty array in PHP. The array thus created, $all_tokens, will eventually contain all of the words in the input.

We’re also seeing a new kind of loop: the foreach loop. Here’s the structure of a foreach loop, in its simplest form:

foreach ($an_array as $loop_val) {
  // code inside the loop will execute once for each element of $an_array,
  // with $loop_val set to each successive value in the array
  echo $loop_val; // or something else entirely
}

Basically, foreach provides a means to iterate over every item of an array, executing the same code for each item. It’s the preferred idiom for iterating over arrays in PHP, and it’s likely the most common loop in PHP by a wide margin. You’re gonna see it all over the place.

Appending and counting

So what’s up with the strange syntax on line 9 ($all_tokens[] = $t;)? This is a bit of syntax in PHP that lets you append items to the end of the array. Some simpler example code:

$test_array = array();
$test_array[] = 1;
$test_array[] = 2;
$test_array[] = 3;
// $test_array now contains three values: 1, 2, 3
echo $test_array[1];
// would print 2

Because arrays in PHP don’t have a fixed size, you’re always going to wonder: how many items are in my array? To help answer this question, PHP provides the handy function count(), which takes an array as an argument and returns the number of elements in the array.
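
A small sketch of count() alongside the append syntax. The sample sentence is borrowed from the poem used earlier; the appended word is made up.

```php
<?php

$words = explode(" ", "so sweet and so cold");
echo count($words) . "\n";  // 5

$words[] = "indeed";        // append one more item
echo count($words) . "\n";  // 6

echo count(array()) . "\n"; // 0: an empty array has no elements
```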

For traditionalists

PHP also supports more C-like for loops, which are handy when you absolutely need to know the numerical index inside the loop. The two loops below produce the same output:

$test_array = array(1, 2, 3, 4, 5);

foreach ($test_array as $val) {
  echo "$val\n";
}

for ($i = 0; $i < count($test_array); $i++) {
  echo $test_array[$i] . "\n";
}

Associative arrays

It turns out that PHP arrays are actually a good deal more powerful than their most obvious counterparts in C, C++ and Java. That’s because PHP arrays aren’t arrays at all—they’re actually a data structure called an “ordered map.” In practical terms, this means that your PHP arrays can have either numbers or strings as indices. In PHP, arrays serve not just as lists, but as maps as well (think C++’s STL “map” class or Java’s “HashMap”).

String indices in arrays are usually called “keys.” You might associate a key with a value in an array with code like the following:

$test_array = array();
$test_array["foo"] = "bar";
echo $test_array["foo"];
// would print "bar"

The next example, concordance.php, builds on tokens.php, but performs one more task: it counts the number of times each token occurs. (The term “concordance” is used here in this sense.) We keep track of that information by building a data structure in memory as we process the lines in the file. This data structure maps each word in the input (the key) to the number of times that word occurs (the value). Here’s what it looks like:

<?php

$words = array();

while ($line = fgets(STDIN)) {
  $line = rtrim($line);
  $tokens = explode(" ", $line);
  foreach ($tokens as $t) {
    if (array_key_exists($t, $words)) {
      $words[$t]++;
    }
    else {
      $words[$t] = 1;
    }
  }
}

foreach ($words as $word => $count) {
  echo "$word: $count\n";
}

The big difference between this example and tokens.php is what we’re doing inside the foreach loop on line 8. First, we check to see if $words already contains a key for the current token. If it does, we increment the value associated with that key: the word has occurred at least once before, and we’re incrementing the number of times it has occurred. If it doesn’t, then we assign the integer value 1 to the key, since this is the first time we’ve seen the word.

You can iterate over arrays with keys just like you can iterate over arrays with numeric indices, using foreach. There’s an alternate syntax for foreach when you’re using an array with keys, as seen on line 18. Here’s a schematic example:

$test_array["foo"] = "bar";
$test_array["baz"] = "quux";
$test_array["xyzzy"] = "ducks";
foreach ($test_array as $key => $value) {
  // this code will be run for each key/value pair.
  // the current key is available inside the loop as $key, and the current value as $value.
  echo "$key: $value\n";
}
/* would print out:
foo: bar
baz: quux
xyzzy: ducks
*/

Mini-exercise time! Modify the concordance example above to be a true concordance—i.e., store not just a count of how many times each word occurs, but an array containing all lines in which each word occurs.

Array literals; or, Of Eels and Bread

Sometimes we want to include the data in an array inside our program, instead of reading it in from input. For that purpose, we have array literals. In PHP, they look like this:

$x = array(1, 2, 3, 4, 5);
// $x is a numerically-indexed array with five elements
$y = array("foo"=>"bar", "baz"=>"quux", "xyzzy"=>"plugh");
// $y is an associative array with three key/value pairs

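
PHP 5.4 and later also accept a shorter square-bracket form for the same literals. A brief sketch showing that the two forms build identical arrays (the names $x_short and $y_short are made up for the example):

```php
<?php

$x = array(1, 2, 3, 4, 5);
$x_short = [1, 2, 3, 4, 5]; // same array, short syntax (PHP 5.4 and later)

$y = array("foo"=>"bar", "baz"=>"quux", "xyzzy"=>"plugh");
$y_short = ["foo"=>"bar", "baz"=>"quux", "xyzzy"=>"plugh"];

var_dump($x === $x_short); // bool(true)
var_dump($y === $y_short); // bool(true)
echo $y_short["baz"] . "\n"; // quux
```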
Here’s an example program, replace.php, which uses this syntax. This program will find strings in the input that match any of the keys in $replacements and replace them with the value for that key:

<?php

$replacements = array(
  'miles' => 'ducks',
  'snow' => 'bread',
  'woods' => 'eels',
  'I' => 'Captain Monkeypants',
  'and' => 'or',
  'house' => 'dirty mind'
);

while ($line = fgets(STDIN)) {
  $line = rtrim($line);
  foreach ($replacements as $word => $replace) {
    $line = str_replace($word, $replace, $line);
  }
  echo "$line\n";
}

What’s new: the call to the str_replace() function on line 15 (inside the foreach loop) replaces occurrences of $word (the word to be replaced) with $replace (the replacement string) in $line (the current line). Here’s sample output from frost.txt:

Stopping By Woods On A Snowy Evening

Whose eels these are Captain Monkeypants think Captain Monkeypants know.
His dirty mind is in the village though;
He will not see me stopping here
To watch his eels fill up with bread.
My little horse must think it queer
To stop without a farmdirty mind near
Between the eels or frozen lake
The darkest evening of the year.
He gives his harness bells a shake
To ask if there is some mistake.
The only other sound's the sweep
Of easy wind or downy flake.
The eels are lovely, dark or deep.
But Captain Monkeypants have promises to keep,
And ducks to go before Captain Monkeypants sleep,
And ducks to go before Captain Monkeypants sleep.
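
Two details of the output above follow from how str_replace() works: it matches substrings rather than whole words, and it is case sensitive. A minimal sketch (the variable names $farm and $title are made up for the example):

```php
<?php

// "house" is replaced even though it is only part of "farmhouse"
$farm = str_replace("house", "dirty mind", "a farmhouse near");
echo "$farm\n";  // a farmdirty mind near

// "woods" does not match the capitalized "Woods", so nothing changes
$title = str_replace("woods", "eels", "Stopping By Woods");
echo "$title\n"; // Stopping By Woods
```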

User-defined functions

It’s easy to define your own functions in PHP. The general syntax looks something like this:

function your_func($param1, $param2, $param3) {
  // do something with your parameters
  return $value; // this is optional, functions don't need to return a value
}

Function definitions in PHP are simple compared to their analogues in C, C++ and Java. You don’t have to specify a return type, or give types to the parameters. Your function can take any number of arguments (zero or more), and can optionally return a value.
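
As a concrete illustration, here is a sketch with one function that takes parameters and returns a value, and one that does neither. Both shout() and print_divider() are made-up names for this example.

```php
<?php

// takes two parameters and returns a string
function shout($text, $times) {
  return strtoupper($text) . str_repeat("!", $times);
}

// takes no parameters and returns nothing; it only prints
function print_divider() {
  echo "----------\n";
}

echo shout("plums", 3) . "\n"; // PLUMS!!!
print_divider();
```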

The following example, replace_rand.php, is similar to replace.php above, except that it replaces strings it finds with a string randomly chosen from one of a number of alternatives. We use a few functions to help compartmentalize and clarify the code. Here’s the source:

<?php
$replacements = array(
  'miles' => array('ducks', 'cheeses', 'roller skates'),
  'snow' => array('bread', 'furniture', 'mutton'),
  'woods' => array('eels', 'liqueurs', 'typefaces'),
  'I' => array('Captain Monkeypants', 'you', 'all of us'),
  'and' => array('or', 'but', 'whether or not'),
  'house' => array('dirty mind', 'ostrich', 'elephant')
);

while ($line = fgets(STDIN)) {
  $line = rtrim($line);
  foreach ($replacements as $word => $replace_list) {
    $line = random_replace($word, $replace_list, $line);
  }
  echo "$line\n";
}

function random_replace($to_replace, $choices, $source) {
  $replaced = str_replace($to_replace, choice($choices), $source);
  return $replaced;
}

function choice($in) {
  return $in[rand(0, count($in) - 1)];
}

  • random_replace() is a function defined here to behave exactly like the built-in function str_replace, except it selects its replacement randomly from the specified array.
  • choice() takes an array as a parameter, and returns a random item from the array. (Note the use of the built-in function rand().)
  • Note the arrays within arrays specified in $replacements!
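
To unpack the arrays within arrays, here is a sketch using a trimmed-down copy of $replacements. The variables $snow_choices and $index are made up for the example; the last lines mirror what choice() does.

```php
<?php

$replacements = array(
  'snow' => array('bread', 'furniture', 'mutton'),
  'woods' => array('eels', 'liqueurs', 'typefaces')
);

$snow_choices = $replacements['snow'];  // the inner array for 'snow'
echo count($snow_choices) . "\n";       // 3
echo $replacements['snow'][1] . "\n";   // furniture
echo $replacements['woods'][2] . "\n";  // typefaces

// rand(0, count(...) - 1) always yields a valid index into the inner array
$index = rand(0, count($snow_choices) - 1);
echo $snow_choices[$index] . "\n";      // one of the three, chosen at random
```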

Here’s some sample output, again using frost.txt as a source:

Stopping By Woods On A Snowy Evening

Whose typefaces these are you think you know.
His ostrich is in the village though;
He will not see me stopping here
To watch his liqueurs fill up with mutton.
My little horse must think it queer
To stop without a farmdirty mind near
Between the liqueurs whether or not frozen lake
The darkest evening of the year.
He gives his harness bells a shake
To ask if there is some mistake.
The only other sound's the sweep
Of easy wind whether or not downy flake.
The eels are lovely, dark but deep.
But Captain Monkeypants have promises to keep,
And roller skates to go before you sleep,
And roller skates to go before Captain Monkeypants sleep.

Another mini-exercise. Notice how “house” gets replaced in the above output, even when it’s part of a larger word (“farmhouse”). Write a version of this program without this problem, i.e., one where only whole words get replaced, not strings that are simply a substring of a larger word.

Further reading
