Thursday Nov 19, 2009

PHP and sessions: Very simple to use, but not as simple to understand as we might want to think.

session.gc_maxlifetime

This value (default 1440 seconds) defines how long an unused PHP session will be kept alive. For example: a user logs in and browses through your application or web site for hours, even for days. No problem, as long as the time between two clicks never exceeds 1440 seconds. It's an idle timeout value.

PHP's session garbage collector runs with a probability defined by session.gc_probability divided by session.gc_divisor. By default this is 1/100, which means that the timeout above is only checked on roughly 1 in 100 requests. An expired session can therefore survive until the next garbage collection run.
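To put the ratio into numbers, here is a minimal sketch. The function name gc_chance() is made up for this illustration; only the two ini settings it models come from PHP.

```php
<?php
// Hypothetical helper (not part of PHP): the chance that a single request
// triggers the session garbage collector.
function gc_chance(int $probability, int $divisor): float
{
    if ($divisor < 1) {
        throw new InvalidArgumentException('session.gc_divisor must be at least 1');
    }
    return $probability / $divisor;
}

// With the default values quoted above (1 and 100):
$chance = gc_chance(1, 100);
printf("GC runs on about %.1f%% of all requests\n", $chance * 100);
```

On a busy site even a small ratio means the collector runs often; on a quiet site expired sessions may hang around for quite a while.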

session.cookie_lifetime

This value (default 0, which means until the browser's next restart) defines how long (in seconds) a session cookie will live. Sounds similar to session.gc_maxlifetime, but it's a completely different approach. This value indirectly defines the "absolute" maximum lifetime of a session, whether the user is active or not. If this value is set to 3600, every session ends after an hour.
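The interplay of the two limits can be modelled in a few lines. This is only a sketch: session_still_valid() is a hypothetical helper, and it assumes the cookie's expiry is set once when the session starts and is not refreshed on later requests.

```php
<?php
// Hypothetical helper: models the two limits described above.
// $age  = seconds since the session cookie was set
// $idle = seconds since the last request of this session
function session_still_valid(int $age, int $idle,
                             int $gcMaxLifetime = 1440,
                             int $cookieLifetime = 0): bool
{
    // cookie_lifetime 0 means "until the browser is closed": no fixed limit.
    $cookieOk = ($cookieLifetime === 0) || ($age < $cookieLifetime);
    // Sessions idle for longer than gc_maxlifetime are garbage and may be
    // deleted at the next garbage collection run.
    $dataOk = ($idle <= $gcMaxLifetime);
    return $cookieOk && $dataOk;
}

var_dump(session_still_valid(86400, 600));           // a day old, idle 10 min: bool(true)
var_dump(session_still_valid(86400, 1500));          // idle 25 min: bool(false)
var_dump(session_still_valid(4000, 10, 1440, 3600)); // cookie limited to 1 hour: bool(false)
```

The second case is the pessimistic view: since the garbage collector only runs occasionally, the session data may in practice survive a bit longer.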

Wednesday Nov 18, 2009

There are a lot of tutorials out there describing how to use PHP's classic MySQL extension to store and retrieve blobs. There are also many tutorials on how to use prepared statements with PHP's MySQLi extension to fight SQL injections in your web application. But there are no tutorials about using MySQLi with any blob data at all.

Until today... ;)

Preparing the database

Okay, first I need a table to store my blobs. In this example I'll store images in my database because images usually look better in a tutorial than some random raw data.

mysql> CREATE TABLE images (
       id INT(10) UNSIGNED NOT NULL AUTO_INCREMENT,
       image MEDIUMBLOB NOT NULL,
       PRIMARY KEY (id)
       );
Query OK, 0 rows affected (0.02 sec)

In general you don't want to store images in a relational database. But that's another discussion for another day.

Storing the blob

To make a long story short, here's the code to store a blob using MySQLi:

<?php
	$mysqli=mysqli_connect('localhost','user','password','db');

	if (!$mysqli)
		die("Can't connect to MySQL: ".mysqli_connect_error());

	$stmt = $mysqli->prepare("INSERT INTO images (image) VALUES(?)");
	$null = NULL;
	$stmt->bind_param("b", $null);

	$stmt->send_long_data(0, file_get_contents("osaka.jpg"));

	$stmt->execute();
?>

If you have already used MySQLi, most of the above should look familiar to you. Two pieces of the code are worth a closer look: the bind_param() call with $null and the send_long_data() call.

  1. The $null variable is needed because bind_param() always wants a variable reference for a given parameter. In this case the "b" (as in blob) parameter. So $null is just a dummy to make the syntax work.

  2. In the next step I need to "fill" my blob parameter with the actual data. This is done by send_long_data(). The first parameter of this method indicates which parameter to associate the data with. Parameters are numbered beginning with 0. The second parameter of send_long_data() contains the actual data to be stored.

While using send_long_data(), please make sure that the blob isn't bigger than MySQL's max_allowed_packet:

mysql> SHOW VARIABLES LIKE 'max_allowed_packet';
+--------------------+----------+
| Variable_name      | Value    |
+--------------------+----------+
| max_allowed_packet | 16776192 | 
+--------------------+----------+
1 row in set (0.00 sec)

If your data exceeds max_allowed_packet, you probably won't get any error back from send_long_data() or execute(). The saved blob is just corrupt!

Simply raise the value of max_allowed_packet to whatever you need. If you're not able to change MySQL's configuration, you'll need to send the data in smaller chunks:

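	// "rb": read the file in binary mode so the image data stays untouched
	$fp = fopen("osaka.jpg", "rb");
	while (!feof($fp))
	{
		// each chunk must stay below max_allowed_packet
		$stmt->send_long_data(0, fread($fp, 16776192));
	}
	fclose($fp);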

Usually the default value of 16M should be a good start.
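The chunking loop can be tried out without a database. The following sketch wraps it in a hypothetical helper send_in_chunks() that hands every chunk to a callback; the helper name, the temporary file and the tiny chunk size of 4 bytes are assumptions for this demo only.

```php
<?php
// Hypothetical helper: reads $path in pieces of at most $chunkSize bytes
// and hands each piece to $send. Returns the number of bytes passed on.
function send_in_chunks(string $path, int $chunkSize, callable $send): int
{
    if ($chunkSize < 1) {
        throw new InvalidArgumentException('chunk size must be positive');
    }
    $fp = @fopen($path, 'rb'); // binary mode keeps the data untouched
    if ($fp === false) {
        throw new RuntimeException("Can't open $path");
    }
    $total = 0;
    try {
        while (!feof($fp)) {
            $chunk = fread($fp, $chunkSize);
            if ($chunk === false || $chunk === '') {
                break; // read error or nothing left to read
            }
            $send($chunk);
            $total += strlen($chunk);
        }
    } finally {
        fclose($fp);
    }
    return $total;
}

// Demo without a database: collect the chunks of a small temporary file.
$tmp = tempnam(sys_get_temp_dir(), 'blob');
file_put_contents($tmp, 'abcdefghij');
$chunks = [];
$sent = send_in_chunks($tmp, 4, function (string $chunk) use (&$chunks) {
    $chunks[] = $chunk;
});
unlink($tmp);
echo $sent, " bytes in ", count($chunks), " chunks\n"; // 10 bytes in 3 chunks
```

With the prepared statement from above, the callback would simply be function ($chunk) use ($stmt) { $stmt->send_long_data(0, $chunk); } and the chunk size something comfortably below max_allowed_packet.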

Retrieving the blob

Getting the blob data out of the database is quite simple and follows the usual way of MySQLi:

<?php
	$mysqli=mysqli_connect('localhost','user','password','db');

	if (!$mysqli)
		die("Can't connect to MySQL: ".mysqli_connect_error());

	$id=1;  
	$stmt = $mysqli->prepare("SELECT image FROM images WHERE id=?"); 
	$stmt->bind_param("i", $id);

	$stmt->execute();
	$stmt->store_result();

	$stmt->bind_result($image);
	$stmt->fetch();

	header("Content-Type: image/jpeg");
	echo $image; 
?>

Connect to the database, prepare the SQL statement, bind the parameter(s), execute the statement, bind the result to a variable, and fetch the actual data from the database. In this case there is no need to worry about max_allowed_packet. MySQLi will do all the work:

3925128491.jpg

By the way...

If you want to insert a blob from the command line using MySQL monitor, you can use LOAD_FILE() to fetch the data from a file:

mysql> INSERT INTO images (image) VALUES( LOAD_FILE("/home/oswald/osaka.jpg") );

Be aware that in this case, too, max_allowed_packet limits the amount of data you're able to send to the database:

mysql> SHOW VARIABLES LIKE 'max_allowed_packet';
+--------------------+-------+
| Variable_name      | Value |
+--------------------+-------+
| max_allowed_packet | 7168  | 
+--------------------+-------+
1 row in set (0.00 sec)

mysql> INSERT INTO images (image) VALUES( LOAD_FILE("/home/oswald/osaka.jpg") );
ERROR 1048 (23000): Column 'image' cannot be null
mysql> SET @@max_allowed_packet=16777216;
Query OK, 0 rows affected (0.00 sec)

mysql> SHOW VARIABLES LIKE 'max_allowed_packet';
+--------------------+----------+
| Variable_name      | Value    |
+--------------------+----------+
| max_allowed_packet | 16777216 | 
+--------------------+----------+
1 row in set (0.00 sec)

mysql> INSERT INTO images (image) VALUES( LOAD_FILE("/home/oswald/osaka.jpg") );
Query OK, 1 row affected (0.03 sec)

		

Friday Nov 13, 2009

Always messed around with a combo of opendir(), readdir(), and closedir() if you wanted to read the contents of a directory? Since PHP 5 there is a new sheriff in town: scandir():

<?php   
	$files=scandir("/etc/php5");
	print_r($files);
?>

Outputs:

Array
(
    [0] => .
    [1] => ..
    [2] => apache2
    [3] => conf.d
)

Okay, you still need to traverse an array, but it's much easier to use than the traditional way.
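As the output shows, scandir() also returns the "." and ".." entries. A small sketch of how to get rid of them; list_entries() is a name I made up, and the demo uses a throwaway directory instead of /etc/php5.

```php
<?php
// Hypothetical helper: directory listing without the "." and ".." entries.
function list_entries(string $dir): array
{
    $entries = @scandir($dir);
    if ($entries === false) {
        throw new RuntimeException("Can't read directory $dir");
    }
    // array_diff() keeps the original keys, array_values() renumbers them.
    return array_values(array_diff($entries, ['.', '..']));
}

// Demo in a throwaway directory.
$dir = sys_get_temp_dir() . '/scandir_demo_' . uniqid();
mkdir($dir);
touch("$dir/b.txt");
touch("$dir/a.txt");
$found = list_entries($dir);
print_r($found); // a.txt comes first: scandir() sorts alphabetically by default
unlink("$dir/a.txt");
unlink("$dir/b.txt");
rmdir($dir);
```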

Friday Nov 13, 2009

My colleague Brian Overstreet wrote a must-read paper about tuning different components of the Sun GlassFish Web Stack focusing on Apache, MySQL, and PHP: Performance Tuning the Sun GlassFish Web Stack.

Thursday Nov 05, 2009

If you used to be a VMware user and try to switch to the Open-Source side of the Force by using VirtualBox, you may run into difficulties when you try to import an existing VDI file into VirtualBox. Actually it's quite easy, if you know how.

The main difference between VMware and VirtualBox is that VMware captures a whole virtual machine in an image, whereas VirtualBox only supports images of a hard disk. So in VirtualBox's world, you first need to create a new virtual machine, before using an existing VirtualBox image.

  1. First copy your VDI file into VirtualBox's virtual hard disks repository. On Mac OS X it's $HOME/Library/VirtualBox/HardDisks/.

  2. Start VirtualBox and create a new virtual machine (according to the OS you expect to live on the VirtualBox image):

    virtualbox1.jpg
  3. When you're asked for a hard disk image, select Use existing hard disk and click on the small icon on the right:

    virtualbox2.jpg
  4. This brings you to the Virtual Media Manager. Click on Add and select the VDI file from step 1.

    virtualbox3.jpg
  5. After leaving the Virtual Media Manager, you'll be back in your virtual machine wizard. Now you can select your new VDI as existing hard disk and finalize the creation process.

    virtualbox4.jpg
  6. Back in the main window, you're now able to start your new virtual machine:

    virtualbox5.jpg

It's quite easy, if you know how.

Wednesday Nov 04, 2009

Last week my blog was dominated by Apache load balancing. Well, I promised to leave this topic alone for a while, but there is one related topic that is worth spending a minute on.

The Theory

If your web application is distributed across multiple servers you'll quickly run into session problems, because each backend server (aka worker) usually stores its session information locally. If subsequent HTTP requests are handled by different workers, a new session is created every time or, even worse, sessions get mixed up.

To overcome this problem there are two solutions:

  1. Use a session-aware load balancer that binds a user session to the same worker.
  2. Or keep all session data in a central storage.

Both solutions have a similar drawback: if a worker goes down, all session data of this worker is lost. If the central storage goes down, all sessions are lost. But consider the following: you'll probably have tons of workers, and since every computer is supposed to fail after a specific period of time, the probability of a worker failure is much higher than for a single storage server. It depends on what you want: a system that runs all the time with small failures, or a system that fails completely from time to time?

And finally, losing session data sounds worse than it actually is: usually the users only have to login again to restore their session data. That's sad, but it's not the end of the world. Okay, your system may get into trouble if thousands of users try to re-login at the same time, but that's another problem.

The Solution

My favorite solution is the second one: keep all session data in a central place. And in this scenario I'll use Apache/PHP as my "application server" and memcached as central storage for my session data. If you read and still remember the title of this post, you're probably not surprised.

phpmemcached.jpg

On the left: my load balancer, in the middle my worker farm, and on the right: my single and central memcached server. By the way: You can also have multiple memcached servers, but for this blog post I'll keep it simple.

The Requirements

First, let's check if PHP was built with memcache support:

serverA ~% php -m | egrep memcache
memcache

...on each worker node: serverA to serverD.

Second, I check if memcached is running on serverM:

serverM ~% ps -efa | egrep memcached
oswald  1543     1   0 15:21:17 ?         0:00 /home/oswald/webstack1.5/lib/memcached -d ...

Perfecto.

The Configuration

Now I need to change the PHP configuration on each worker node: Open php.ini on serverA to serverD and search for these lines:

[Session]
; Handler used to store/retrieve data.
session.save_handler = files

And change the configuration like this:

[Session]
; Handler used to store/retrieve data.
; session.save_handler = files
session.save_handler = memcache
session.save_path = "tcp://serverM:11211"

Make sure that the settings are the same on all your workers.

That's all. Yes, that's the basic configuration. PHP's sessions will now get stored on the memcached node serverM. No more magic needed.
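If you run more than one memcached server, the save path becomes a list. As far as I know the memcache extension accepts a comma-separated list of server URLs in session.save_path; the following sketch merely builds such a string. The helper name memcache_save_path() and the second host serverN are made up for this example.

```php
<?php
// Hypothetical helper: builds a session.save_path value for the memcache
// session handler from a list of host names.
function memcache_save_path(array $hosts, int $port = 11211): string
{
    $urls = [];
    foreach ($hosts as $host) {
        $urls[] = "tcp://$host:$port";
    }
    return implode(', ', $urls);
}

echo memcache_save_path(['serverM']), "\n";            // tcp://serverM:11211
echo memcache_save_path(['serverM', 'serverN']), "\n"; // tcp://serverM:11211, tcp://serverN:11211
```

The resulting string goes into php.ini, or, assuming the memcache extension is loaded, can be set with ini_set('session.save_path', ...) before session_start() is called.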

The Proof

But as we say in Germany: "Prudence is the mother of the china cabinet." Before we can grab the beer, we should make sure everything works as we expect it to.

I put this code in a file named session.php in the document root directory of all my worker nodes:

<?php   
	session_start();
	if(isset($_SESSION['zaphod']))
	{       
		echo "Zaphod is ".$_SESSION['zaphod']."!\n";
	}       
	else    
	{       
		echo "Session ID: ".session_id()."\n"; 
		echo "Session Name: ".session_name()."\n";
		echo "Setting 'zaphod' to 'cool'\n";
		$_SESSION['zaphod']='cool';
	}       
?>

From the outside I use lynx to access this file:

% lynx -source 'http://serverA/session.php'
Session ID: df58bc9465f27aa20218c11caba6750f
Session Name: PHPSESSID
Setting 'zaphod' to 'cool'

A new session with the ID df58bc9465f27aa20218c11caba6750f was created and PHP uses the session name PHPSESSID to identify the session parameter. And the session variable zaphod was set to the value cool.

Now I add the session information PHPSESSID=df58bc9465f27aa20218c11caba6750f to my URL and rerun the new lynx command:

% lynx -source 'http://serverA/session.php?PHPSESSID=df58bc9465f27aa20218c11caba6750f'
Zaphod is cool!

Yes, I got the expected output: Zaphod is cool! This proves the session data is available on serverA. But that's not a big surprise. What about the other nodes? I replace serverA with serverB in my URL:

% lynx -source 'http://serverB/session.php?PHPSESSID=df58bc9465f27aa20218c11caba6750f'
Zaphod is cool!

Bingo, serverB also has the same session data as serverA.

And for serverC? It's also the same:

% lynx -source 'http://serverC/session.php?PHPSESSID=df58bc9465f27aa20218c11caba6750f'
Zaphod is cool!

And so on... for each worker node the session data will be the same.

A dream came true.

Thursday Oct 29, 2009

vb200x150.png

In December, I'll give a talk at SAPO Codebits 2009 in Lisbon, Portugal. SAPO Codebits is a hacking event held in Portugal annually, completely organized and sponsored by SAPO, a Portuguese ISP and subsidiary company of the Portugal Telecom Group.

I will be speaking about web server architectures and web services in general, and discuss the pros and cons of different programming languages (like PHP, Java, Python, Ruby, Perl, JavaScript, ASP.NET) and database technologies in the field of web application development, deployment and hosting. In order to save time and keep development costs at a reasonable level, it's very important to identify system flaws and architectural weaknesses at an early stage of the development process.

The talk shows pitfalls and common mistakes developers make when building web-based applications and also provides useful hints on how to avoid them at an early stage. The talk ends with a quick introduction to horizontal and vertical scaling.

So if you happen to be there, drop me a line or simply come and say hello.

PS: I heard Lenz will give a talk there too. Great news!

Thursday Oct 29, 2009

Since the Apache load balancer seems to be my topic of the week, let's focus on another related question: What happens if a worker (backend server) doesn't show up for work?

Let's say server B needed to go down for maintenance and is no longer available for the cluster:

loadbalancer2bsc.jpg

For this example I simply shut server B's Apache daemon down. I made no other changes to my configuration. And voila:

# repeat 12 lynx -source http://loadbalancer
This is A.
This is C.
This is D.
This is A.
This is C.
This is D.
This is A.
This is C.
This is D.
This is A.
This is C.
This is D.

The load balancer automatically notices that server B isn't available any more and simply skips it while cycling through its list of workers.

After getting server B up again, it takes 60 seconds (configurable default value) until server B shows up again in my cluster:

# repeat 12 lynx -source http://loadbalancer
This is A.
This is B.
This is C.
This is D.
This is A.
This is B.
This is C.
This is D.
This is A.
This is B.
This is C.
This is D.

Nice.
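The observed behaviour can be imitated with a toy model. To be clear: this is not how mod_proxy_balancer is implemented, just a sketch of "skip a worker until its retry time is over". The function pick_worker() and the timestamps are invented for this illustration.

```php
<?php
// Toy model, NOT Apache's real code: round-robin over $workers, skipping
// every worker whose "down until" timestamp still lies in the future.
function pick_worker(array $workers, array $downUntil, int &$cursor, int $now): ?string
{
    $n = count($workers);
    for ($tries = 0; $tries < $n; $tries++) {
        $candidate = $workers[$cursor % $n];
        $cursor++;
        if (($downUntil[$candidate] ?? 0) <= $now) {
            return $candidate;
        }
    }
    return null; // no worker is available
}

$workers = ['A', 'B', 'C', 'D'];
$downUntil = ['B' => 60]; // B failed at t=0 and is retried 60 seconds later
$cursor = 0;

$during = [];
for ($i = 0; $i < 6; $i++) {
    $during[] = pick_worker($workers, $downUntil, $cursor, 10);
}
$after = [];
for ($i = 0; $i < 4; $i++) {
    $after[] = pick_worker($workers, $downUntil, $cursor, 60);
}
echo implode(' ', $during), "\n"; // A C D A C D
echo implode(' ', $after), "\n";  // A B C D
```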

Wednesday Oct 28, 2009

HTTP load balancers have one natural enemy: redirections. For example, a "trailing slash" redirect is issued when the server receives a request for a URL like http://servername/dir where dir is a directory. In such a case the server redirects the browser to http://servername/dir/ (including the trailing slash):

# lynx -mime_header http://loadbalancer/dir | egrep Location:
Location: http://serverA/dir/
# lynx -mime_header http://loadbalancer/dir | egrep Location:
Location: http://serverB/dir/

Accessing http://loadbalancer/dir will result in a redirect to http://serverA/dir/ (if it's serverA's turn) instead of http://loadbalancer/dir/. This happens because serverA simply doesn't know about the load balancer at all.

The solution is to tell the load balancer to rewrite all serverX addresses to the load balancer's address:

	ProxyPassReverse / http://serverA/
	ProxyPassReverse / http://serverB/
	ProxyPassReverse / http://serverC/
	ProxyPassReverse / http://serverD/

Now all server-generated redirects will get rewritten to the load balancer's address:

# lynx -mime_header http://loadbalancer/dir | egrep Location:
Location: http://loadbalancer/dir/

Of course in real life the load balancer address would be something like http://www.sun.com.
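Conceptually, ProxyPassReverse is a prefix replacement on the Location header. The following sketch shows the idea in PHP; it's my own illustration with a made-up function name, not Apache's implementation.

```php
<?php
// Conceptual sketch: replace a known backend prefix in a Location header
// with the public address of the load balancer.
function rewrite_location(string $location, array $backends, string $public): string
{
    foreach ($backends as $backend) {
        if (strncmp($location, $backend, strlen($backend)) === 0) {
            return $public . substr($location, strlen($backend));
        }
    }
    return $location; // not one of our workers: leave it untouched
}

$backends = ['http://serverA/', 'http://serverB/', 'http://serverC/', 'http://serverD/'];
echo rewrite_location('http://serverB/dir/', $backends, 'http://loadbalancer/'), "\n";
// http://loadbalancer/dir/
```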

Tuesday Oct 27, 2009

Usually a single AMP system is enough to serve - let's say - around 500 concurrent users. Sometimes more, sometimes less, strongly depending on the particular web application, the overall architecture of your system, of course the hardware itself, and how you define "concurrent users".

Nevertheless, if your server gets too slow, you'll need to take action. You may upgrade your server up to the maximum (aka vertical scaling), optimize your software (aka refactoring), and finally add more servers (aka horizontal scaling). The whole process of horizontal scaling is quite complex and far too much for a single blog post, but here's a first shot. Others will follow.

Today I'll focus on one single aspect of horizontal scaling: an HTTP load balancer.

loadbalancer1bsc.jpg

On the left: a whole crowd of people ready to visit our web site. On the right: our server farm (called workers). And in the middle: our current hero, the load balancer. The purpose of the load balancer (in this case an HTTP load balancer) is to distribute all incoming requests to our backend web servers. The load balancer hides all our backend servers from the public, and from the outside it looks like a single server doing all of the work.

The Recipe

Okay, let's start. Step by step.

  1. Since version 2.2 the Apache web server ships a load balancer module called mod_proxy_balancer. All you need to do is to enable this module and the modules mod_proxy and mod_proxy_http:

    LoadModule proxy_module mod_proxy.so
    LoadModule proxy_http_module mod_proxy_http.so
    LoadModule proxy_balancer_module mod_proxy_balancer.so

    Please don't forget to load mod_proxy_http, because you won't get any error message if it's missing. The balancer just won't work.

  2. Because mod_proxy makes Apache become an (open) proxy server, and open proxy servers are dangerous both to your network and to the Internet at large, I completely disable this feature:

    	ProxyRequests Off
    	<Proxy *>
    		Order deny,allow
    		Deny from all
    	</Proxy>
    

    The load balancer doesn't need this feature at all.

  3. Now I need to make sure all my backend web servers have the same content:

    serverA htdocs% cat index.html
    This is A.
    serverB htdocs% cat index.html
    This is B.
    serverC htdocs% cat index.html
    This is C.
    serverD htdocs% cat index.html
    This is D.

    Okay, in this case the content differs, but I need this to show how the load balancer works.

  4. And here's the actual load balancer configuration:

    	<Proxy balancer://clusterABCD>
    		BalancerMember http://serverA
    		BalancerMember http://serverB
    		BalancerMember http://serverC
    		BalancerMember http://serverD
    		Order allow,deny
    		Allow from all
    	</Proxy>
    	ProxyPass / balancer://clusterABCD/
    

    The <Proxy>...</Proxy> container defines which backend servers belong to my balancer. I chose the name clusterABCD for this server group, but you are free to choose any name you want.

    And the ProxyPass directive instructs the Apache to forward all incoming requests to this group of backend servers.

  5. That's all? Yes, that's all. Here's the proof:

    # repeat 12 lynx -source http://loadbalancer
    This is A.
    This is B.
    This is C.
    This is D.
    This is A.
    This is B.
    This is C.
    This is D.
    This is A.
    This is B.
    This is C.
    This is D.

    Each request to the load balancer is forwarded to one of the backend servers. By default Apache simply counts the number of requests and makes sure every backend server gets the same amount of requests forwarded.

    If you want to know more about available balancing algorithms please refer to Apache's mod_proxy_balancer manual.
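The request counting can be pictured with a toy model: always give the next request to the worker that has served the fewest so far. This is a simplification of my own and not Apache's actual code (the real module also knows load factors, for example); next_worker() is an invented name.

```php
<?php
// Toy model of balancing by request count: the next request goes to the
// worker with the lowest counter (ties go to the first worker in the list).
function next_worker(array &$counts): string
{
    $best = null;
    foreach ($counts as $name => $served) {
        if ($best === null || $served < $counts[$best]) {
            $best = $name;
        }
    }
    if ($best === null) {
        throw new RuntimeException('no workers configured');
    }
    $counts[$best]++;
    return (string) $best;
}

$counts = ['A' => 0, 'B' => 0, 'C' => 0, 'D' => 0];
$order = [];
for ($i = 0; $i < 12; $i++) {
    $order[] = next_worker($counts);
}
echo implode(' ', $order), "\n"; // A B C D A B C D A B C D
```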

Did you ever imagine setting up a load balancer would be this easy? Of course, there is more to say about (HTTP) load balancing and much more about vertical scaling too, but this is only a blog posting and not a place for such an expansive reference. If time and space allow, I'll go into further details on this in the near future.

Monday Oct 26, 2009

Sun Microsystems' blogs.sun.com employee blogging site (affectionately named BSC) uses Apache Roller to manage the site and house all the blogs. Roller is open source Java blog software that also drives, for example, the US Government's blog.usa.gov and the IBM developerWorks blogs. But there is one feature I really missed: an entry counter in my sidebar's category list. It's a standard feature in the WordPress world, but in Roller I didn't find any option to enable such a counter.

categoriecounter.png

But as they say, "If the mountain won't come to Mohammed, then Mohammed will go to the mountain": If the feature will not come to me, then I will need to take care of it myself.

And here is my mountain:

#set($rootCategory = $model.weblog.getWeblogCategory("nil"))

#set($cats = $rootCategory.getWeblogCategories())
#foreach($cat in $cats)
  #set($entriesList = $model.weblog.getRecentWeblogEntries($cat.name, 500))
  #set($count = $entriesList.size())
  #if($model.weblogCategory && $model.weblogCategory.path == $cat.path)
    <li class="selected"><a href="$url.category($cat.path)">$cat.name ($count)</a></li>
  #else
    <li><a href="$url.category($cat.path)">$cat.name ($count)</a></li>
  #end
#end

It's quite easy and straightforward: Get a list of all categories, for each category get all the blog entries (in this case limited to 500, because I don't know how this scales), count the entries, and finally generate some HTML. Note that because of this limit the counter never shows more than 500 for a category.

If only everything were that easy! :)

Thursday Oct 22, 2009

Yesterday, I started a small tutorial on how to implement typographic headlines with PHP. There were some aspects to be aware of, but in general it was an easy and straightforward process. The final result looked like this:

headline_example-8bit.png

But there was one big issue I had with my script: It was far too slow (33 requests per second) for use in a production environment. So today I'll extend my previous script with a simple caching mechanism to make it ready for the real world.

Welcome to the Pleasuredome

Basically that's where I left off yesterday:

$font_file="./FFF Tusj.ttf";
$font_size=64;
$text = "An Example";

$bb = imagettfbbox($font_size,0,$font_file,$text);

$bb_width = $bb[4]-$bb[6];
$bb_height = $bb[3]-$bb[5];

$image  =  imagecreate($bb_width, $bb_height);     

$fillcolor  =  imagecolorallocate($image, 255, 255, 255);    
$fontcolor  =  imagecolorallocate($image, 69, 138, 186);    

imagefill($image, 0, 0, $fillcolor);    

imagettftext($image, $font_size, 0, abs($bb[6]), $bb_height-$bb[3], $fontcolor, 
		$font_file, $text);

header("Content-Type: image/png");
imagepng($image);

I'll put this in a function - let's say - fancyheadline($text) and will add two lines of code at the beginning of this function and change the last lines a little bit.

The final script will look like this (changes highlighted):

function fancyheadline($text)
{
	$cache="cache/".md5($text).".png";
	if(!file_exists($cache))
	{       
		$font_file="./FFF Tusj.ttf";
		$font_size=64;

		$bb = imagettfbbox($font_size,0,$font_file,$text);

		$bb_width = $bb[4]-$bb[6];
		$bb_height = $bb[3]-$bb[5];

		$image  =  imagecreate($bb_width, $bb_height);    

		$fillcolor  =  imagecolorallocate($image, 255, 255, 255);    
		$fontcolor  =  imagecolorallocate($image, 69, 138, 186);    

		imagefill($image, 0, 0, $fillcolor);    

		imagettftext($image, $font_size, 0, abs($bb[6]), $bb_height-$bb[3],
				$fontcolor, $font_file, $text); 

		// No header() call here: the function returns HTML, not image data.
		imagepng($image,$cache);
	}       
	return '<img src="'.$cache.'" alt="'.htmlspecialchars($text).'">';

}
echo fancyheadline("An Example");

What's happening here is that I first generate an MD5 hash of my headline string $text and check whether a cache file named after this hash exists. If it doesn't exist, I create the file containing the rendered headline image. If it exists, there is nothing to do.

At the end of the function I return a string consisting of an IMG HTML tag which will display the cached image file. (With htmlspecialchars() I convert special characters like ", &, < and > to their corresponding HTML entities, because these characters may break the validity of the HTML.)
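The "render once, reuse afterwards" idea isn't tied to images. Here's a generic sketch that can be run without GD; cached_file() is a hypothetical helper, and the renderer in the demo only returns a dummy string. Note that the cache key in the demo also contains font and size, an assumption that goes beyond the script above, where only $text is hashed.

```php
<?php
// Hypothetical helper: returns the path of a cached file and calls $render
// only if that file doesn't exist yet. $render must return the file content.
function cached_file(string $dir, string $key, string $ext, callable $render): string
{
    $path = $dir . '/' . md5($key) . '.' . $ext;
    if (!file_exists($path)) {
        // Write to a temporary name first, so that a parallel request
        // never gets to see a half-written file.
        $tmp = $path . '.' . uniqid('', true) . '.tmp';
        file_put_contents($tmp, $render());
        rename($tmp, $path);
    }
    return $path;
}

// Demo in a throwaway directory with a counting dummy renderer.
$dir = sys_get_temp_dir() . '/cache_demo_' . uniqid();
mkdir($dir);
$renders = 0;
$render = function () use (&$renders) {
    $renders++;
    return 'pretend this is PNG data';
};
$key = 'FFF Tusj.ttf|64|An Example';
$first = cached_file($dir, $key, 'png', $render);
$second = cached_file($dir, $key, 'png', $render);
echo "rendered $renders time(s)\n"; // rendered 1 time(s)
unlink($first);
rmdir($dir);
```

For the headline script the renderer could capture the image data with ob_start(), imagepng($image) and ob_get_clean() and return it.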

Before I can run my script, I need to create the cache directory and give it appropriate permissions:

% mkdir cache
% chmod a+rwx cache

On a production server you probably don't want to give write permissions to everyone. Instead, change the ownership of the directory to www-data or whatever user your web server runs as.

Now it's time to access the script in my trusted browser:

examplewcache.jpg

This looks exactly like yesterday's original example above, but this time the script generates the headline image "in the background" and only outputs some HTML code referring to this image:

<img src="cache/e6cf1c67e6acfa204bb784cd6b25839f.png" alt="An Example">

In other words: I reduced the use and need of PHP as much as possible.

Final words by ApacheBench

First let's check the PHP script itself:

% ab -n 1000 http://demo/headline.php
This is ApacheBench, Version 2.3
...
Document Path:          /headline.php
Document Length:        71 bytes
...
Requests per second:    1801.51 [#/sec] (mean)
...

1800 requests per second. Yes, that's what I wanted to hear. But it's easy to explain: The headline image is generated only once, upon the first request. And for all following requests the script only refers to the already generated image.

And if I benchmark the image itself:

% ab -n 1000 http://demo/cache/e6cf1c67e6acfa204bb784cd6b25839f.png
This is ApacheBench, Version 2.3
...
Document Path:          /cache/e6cf1c67e6acfa204bb784cd6b25839f.png
Document Length:        7888 bytes
...
Requests per second:    3308.09 [#/sec] (mean)
...

Of course it's even faster: the image is now just static data, and I get the most performance possible out of my server.

I started with 33 requests per second and ended up somewhere between 1800 and 3300 requests per second.

Moving at one million miles an hour... Welcome to the Pleasuredome!

Wednesday Oct 21, 2009

My recent blog post about scaling images with PHP gave me the idea to write something about creating typographic headlines with PHP. At Apache Friends we've been using this technique for many years to get rid of the usual boring and ubiquitous "web fonts" like Helvetica, Times and Verdana.

For this example I chose the font Tusj by Norwegian graphic designer Magnus Cederholm. Okay, this font will only work for very large headlines, but it looks cool and it's perfect for this demo's purposes because the TTF file is huge (1.5 MB), which makes the processing in PHP quite slow. (Yes, in this case, I want to slow down my PHP script.)

The Basics

First, I define some basic parameters: the TTF font file, the font size, and an example text.

$font_file = "./FFF Tusj.ttf";
$font_size = 64;
$text = "An Example";

In the next step I have to find out how big my image needs to be to hold the rendered text. That's not trivial because it strongly depends on the characters used, the chosen size and of course the font itself. To solve this problem PHP offers the imagettfbbox() function:

$bb = imagettfbbox($font_size, 0, $font_file, $text);
$bb_width = $bb[4]-$bb[6];
$bb_height = $bb[3]-$bb[5];
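The index arithmetic becomes clearer once you know the layout of the returned array: imagettfbbox() returns eight numbers, the x/y pairs of the lower left, lower right, upper right and upper left corner, in this order. A small sketch with invented numbers (no GD or font file needed); bbox_size() is a made-up helper name.

```php
<?php
// Layout of the array returned by imagettfbbox():
//   [0],[1] lower left x,y    [2],[3] lower right x,y
//   [4],[5] upper right x,y   [6],[7] upper left x,y
function bbox_size(array $bb): array
{
    return [
        'width'  => $bb[4] - $bb[6], // upper right x minus upper left x
        'height' => $bb[3] - $bb[5], // lower right y minus upper right y
    ];
}

// Invented example values, not the real box of the Tusj font:
$bb = [-1, 5, 200, 5, 200, -60, -1, -60];
$size = bbox_size($bb);
echo $size['width'], "x", $size['height'], "\n"; // 201x65
```

The upper y values are negative because they lie above the text's baseline.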

With imagecreate() I can now create the image using $bb_width and $bb_height for the size:

$image = imagecreate($bb_width, $bb_height);    

Define two colors: one for the background and one for the foreground.

$fillcolor = imagecolorallocate($image, 255, 255, 255);    
$fontcolor = imagecolorallocate($image, 69, 138, 186);    

Fill the background using $fillcolor:

imagefill($image, 0, 0, $fillcolor);    

Render the text to the image:

imagettftext($image, $font_size, 0, abs($bb[6]),$bb_height-$bb[3], $fontcolor,
		$font_file, $text);

I don't want to go into the details of this function; for a detailed explanation of all parameters please refer to the PHP manual.

And finally send it to the browser:

header("Content-Type: image/png");   
imagepng($image);

Basically that's all you need to do. Here's the output of the above PHP script:

headline_example.png

Because I used imagecreate() and imagepng() the file I got is an indexed-colored 8-bit PNG with a file size of 8 KB.

Some Variations

If I use imagecreatetruecolor() and imagepng() I will get a truecolored 24-bit PNG with a file size of 41 KB:

headline_example-8bit.png

And if I use imagejpeg() instead of imagepng() I will get a truecolored 24-bit JPEG with a quality setting of 100 and a file size of 41 KB:

headline_example_jpeg.jpg

All three variations look exactly the same, but the indexed-color 8-bit PNG is the smallest one. So for this purpose an 8-bit PNG seems to be the best choice.

Turning off antialiasing?

By default imagettftext() uses an antialiasing algorithm to smooth the output. Using the negative of a color index turns this feature off:

imagettftext($image, $font_size, 0, abs($bb[6]),$bb_height-$bb[3], -$fontcolor,
		$font_file, $text);

Sometimes (usually in case of small font sizes) this will give you a sharper and better looking result, but in this special case it definitely looks worse:

headline_example-8bit-wo-antialias.png

Welcome to the Real World

As I mentioned in the beginning, I intentionally chose a font based on a very large TTF file, which makes it very expensive for PHP to render a headline. Let's take a look at some quick benchmark results:

% ab -n 1000 http://demo/headline.php
This is ApacheBench, Version 2.3
...
Benchmarking demo (be patient)
...
Document Path:          /headline.php
Document Length:        7888 bytes
...
Requests per second:    33.52 [#/sec] (mean)
...

Ouch... 33 requests per second is far too slow for a real world scenario. Yes, if I had chosen a smaller font the results would be much better, but the script would probably still not be suitable for use in a production environment. However, a simple caching mechanism should easily solve this issue.

But not today, stay tuned for part 2 of this tutorial. Live long and prosper.

Tuesday Oct 20, 2009

Scaling images in PHP is quite easy, but there are some things to consider. (If you're short of time, right at the end you'll find the final script.)

Read the original image with imagecreatefromjpeg()

First of all you'll need to read the original image. If it's a JPEG file the imagecreatefromjpeg() function is the right choice:

	$source_image = imagecreatefromjpeg("osaka.jpg");

If it's a GIF file you'll take imagecreatefromgif(), and if it's a PNG you will prefer imagecreatefrompng().

For this small tutorial I'll use this image from the Osaka Aquarium Kaiyukan.

Get the size of the original image: getimagesize() vs. imagesx() and imagesy()

Reading the image is quite easy, but you'll encounter the first pitfall as soon as you prepare to scale it and need to find out the dimensions of the original image.

Most tutorials will propose this way:

	$source_image = imagecreatefromjpeg("osaka.jpg");
	$source_image_size = getimagesize("osaka.jpg");

The problem with the getimagesize() function is that it needs to reopen the file to get the actual image size. This is usually not a big issue if you're reading a local file, but it will get critical if you're reading a file from the network like in:

	$source_image = imagecreatefromjpeg("http://someurl/osaka.jpg");
	$source_image_size = getimagesize("http://someurl/osaka.jpg");

Every time you call getimagesize() the image file has to be fetched over the network again, and that quickly becomes a serious bottleneck. Since you already have the image loaded with imagecreatefromjpeg() there is no need to load it again:

	$source_image = imagecreatefromjpeg("osaka.jpg");
	$source_imagex = imagesx($source_image);
	$source_imagey = imagesy($source_image);
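
The same idea also works if you already hold the image bytes in memory, for example after fetching a remote file once. This is just a sketch: image_from_bytes() is a hypothetical helper name, and it relies on GD's imagecreatefromstring() to detect the format.

```php
<?php
// Hypothetical helper: decode an image from bytes that are already in
// memory and report its size without reading the file a second time.
function image_from_bytes(string $bytes): array
{
    $image = imagecreatefromstring($bytes); // detects JPEG, PNG, GIF, ...
    if ($image === false) {
        throw new RuntimeException("Not a supported image format");
    }
    return [$image, imagesx($image), imagesy($image)];
}

// Usage (the URL is the placeholder used in the text above):
// $bytes = file_get_contents("http://someurl/osaka.jpg");
// list($source_image, $source_imagex, $source_imagey) = image_from_bytes($bytes);
```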

Create the destination image: imagecreate() vs. imagecreatetruecolor()

Now we need to prepare the (scaled) destination image. There are two PHP functions which can be used to create an image: imagecreate() and imagecreatetruecolor(). The first creates a palette-based image (with a maximum of 256 different colors), and the second one creates a true color image (with a maximum, as far as I know, of 256*256*256 = roughly 16.7 million colors).

Let's compare the results of both functions: On the left imagecreate() and on the right imagecreatetruecolor():

scaled_256colors.jpg original_scaled.jpg

It's obvious: As long as you work with photographic images you'll need more than 256 colors.

So let's decide to use imagecreatetruecolor() and define a target size of 300x200 pixels:

	$dest_imagex = 300;
	$dest_imagey = 200;
	$dest_image = imagecreatetruecolor($dest_imagex, $dest_imagey);
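
A fixed target size of 300x200 only keeps the proportions if the original has the same 3:2 aspect ratio. If it doesn't, you could compute the target size first. The following is a sketch under that assumption; fit_within() is a hypothetical helper name and not part of GD.

```php
<?php
// Hypothetical helper: largest size that fits into the given box while
// keeping the aspect ratio of the source image.
function fit_within(int $src_w, int $src_h, int $max_w, int $max_h): array
{
    // Use the smaller of the two scale factors so both sides fit.
    $scale = min($max_w / $src_w, $max_h / $src_h);

    return [
        max(1, (int) round($src_w * $scale)),
        max(1, (int) round($src_h * $scale)),
    ];
}

// Usage with the variable names from this tutorial:
// list($dest_imagex, $dest_imagey) = fit_within($source_imagex, $source_imagey, 300, 200);
// $dest_image = imagecreatetruecolor($dest_imagex, $dest_imagey);
```

For a 1200x800 source this gives 300x200 again, while a square 1000x1000 source would end up as 200x200.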

Scale the image: imagecopyresized() vs. imagecopyresampled()

Now it's time to do the actual scaling of the image. Again PHP offers two functions for this purpose: imagecopyresized() and imagecopyresampled(). The first one uses a very simple algorithm to scale the image. It's fast, but the quality is really poor. The second one uses a better, but slower, algorithm, resulting in a very high quality image.

Poor quality, but fast:

	imagecopyresized($dest_image, $source_image, 0, 0, 0, 0, 
				$dest_imagex, $dest_imagey, $source_imagex, $source_imagey);

Best quality, but slow:

	imagecopyresampled($dest_image, $source_image, 0, 0, 0, 0, 
				$dest_imagex, $dest_imagey, $source_imagex, $source_imagey);

I don't want to go into the details of these functions. For a detailed explanation of all parameters please refer to the PHP manual.

Let's compare the results: On the left imagecopyresized() and on the right imagecopyresampled():

resized.jpg resampled.jpg

Again it's obvious: The quality of imagecopyresampled() is much better. In my opinion there is never a reason to use the faster imagecopyresized(). Why would I ever want a low quality image? Even if it's faster to get?

And finally push the image to the browser: imagejpeg() vs. imagepng()

After scaling the image, we now need to push the image to the user's browser. Probably the most popular image formats on the Internet are currently PNG and JPEG. Both work great with photographic images, but true-color, lossless PNG images usually result in larger files than JPEG images.

To send a PNG image (with the highest compression level, 9) to the browser:

	header("Content-Type: image/png");
	imagepng($dest_image,NULL,9);

Or a JPEG image (with best quality 100):

	header("Content-Type: image/jpeg");
	imagejpeg($dest_image,NULL,100);

And in comparison, imagepng() on the left vs. imagejpeg() on the right:

scalepng.png scaledjpg-100.jpg

Both look absolutely the same, but the JPEG image has a size of 57 KB (using the best quality of 100) and the PNG image is 102 KB (using the highest available compression level).

What's the best JPEG quality to choose?

JPEG images are not only smaller but also give you the flexibility to choose the quality and thereby, indirectly, the file size. In PHP you can choose the quality in a range from 0 (worst quality, smallest file) to 100 (best quality, biggest file). Let's take a look.

Quality 100 (57 KB) and quality 80 (16 KB):

scaledjpg-100.jpg scaledjpeg-80.jpg

If you look very carefully at the quality 80 version on the right, you'll see very small artifacts caused by the JPEG compression.

Quality 60 (12 KB) and quality 40 (8 KB):

scaledjpeg-60.jpg scaledjpeg-40.jpg

The loss of quality gets worse, and in my opinion the quality 40 image on the right looks terrible.
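
If you want to repeat this comparison with your own images, you could measure the encoded size for each quality setting. This is only a sketch: jpeg_size() is a hypothetical helper name, and it captures the output of imagejpeg() with output buffering instead of writing files.

```php
<?php
// Hypothetical helper: encode a GD image as JPEG in memory and return
// the number of bytes, so different quality settings can be compared.
function jpeg_size($image, int $quality): int
{
    ob_start();                        // capture what imagejpeg() prints
    imagejpeg($image, null, $quality); // null = send to output, not to a file
    return strlen(ob_get_clean());
}

// Usage with the scaled image from this tutorial ($dest_image):
// foreach ([100, 80, 60, 40] as $quality) {
//     printf("quality %3d: %d bytes\n", $quality, jpeg_size($dest_image, $quality));
// }
```

The numbers you get will depend on your image, so don't expect exactly the sizes quoted above.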

And now the whole script...

Many words end in a small script:

<?php   
	$source_image = imagecreatefromjpeg("osaka.jpg");
	$source_imagex = imagesx($source_image);
	$source_imagey = imagesy($source_image);

	$dest_imagex = 300;
	$dest_imagey = 200;
	$dest_image = imagecreatetruecolor($dest_imagex, $dest_imagey);

	imagecopyresampled($dest_image, $source_image, 0, 0, 0, 0, $dest_imagex, 
				$dest_imagey, $source_imagex, $source_imagey);

	header("Content-Type: image/jpeg");
	imagejpeg($dest_image,NULL,80);
?>

In this script I used a quality of 80, but that's just my personal preference. You may choose whatever you like. But please, not less than 40.
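
If you need to scale more than one image, the same steps can be wrapped in a function. This is a sketch with a hypothetical function name, scale_image(), and it does nothing beyond what the script above already does.

```php
<?php
// Hypothetical wrapper around the steps of the script above:
// create a true color destination image and resample the source into it.
function scale_image($source_image, int $dest_imagex, int $dest_imagey)
{
    $dest_image = imagecreatetruecolor($dest_imagex, $dest_imagey);

    imagecopyresampled($dest_image, $source_image, 0, 0, 0, 0,
        $dest_imagex, $dest_imagey,
        imagesx($source_image), imagesy($source_image));

    return $dest_image;
}

// Usage:
// $thumb = scale_image(imagecreatefromjpeg("osaka.jpg"), 300, 200);
// header("Content-Type: image/jpeg");
// imagejpeg($thumb, null, 80);
```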

Postscript

In many tutorials the PHP script ends with several imagedestroy() function calls. imagedestroy() frees any memory associated with an image. This is a good idea if you work with several image resources one after another within a single PHP script. But if the imagedestroy() call is right at the end of a script, you may omit it. When the script ends, PHP automatically frees any resources.

Wednesday Oct 14, 2009

Today I want to show you a quick installation walkthrough of Sun GlassFish Web Stack. I'm using Solaris 10 in this walkthrough, but installation on RHEL is absolutely the same. As a small deployment example for a web application I'll do an installation of WordPress.

Web Stack Installation

  1. Okay, first step: Get the Sun GlassFish Web Stack.

    Simply enter http://sun.com/webstack in your browser.

    sws_00.jpg

    After clicking on the Get It button, you're asked to pick your platform: Red Hat Enterprise Linux or Solaris 10. (You may wonder why there is no OpenSolaris download. That's quite simple: the Web Stack is already shipped with OpenSolaris (2009.06), so all you need to do to install the main components is a pkg install amp. And you'll get the Apache, MySQL and PHP components of Web Stack installed right on your system.)

    After picking the platform, you need to decide which kind of distribution you want to download: native packages or the IPS-based distribution:

    sws_01.jpg

    The native packaging distribution (aka RPM/SVR4) comes in two versions: one including Java-based components like Tomcat, GlassFish and Hudson, and one without.

    For this demo I picked my personal favorite, the IPS-based distribution, because it allows me a non-root install and gives me the ability to place my Web Stack anywhere I want in the directory tree of my system. So I don't need to have root access and can install Web Stack as a regular user.

  2. After the download is finished I open a terminal and take a look at my system environment:

    [oswald@localhost ~]$ df -h
    Filesystem            Size  Used Avail Use% Mounted on
    /dev/mapper/VolGroup00-LogVol00
                          7.2G  2.3G  4.5G  34% /
    /dev/hda1              99M   12M   83M  13% /boot
    tmpfs                 125M     0  125M   0% /dev/shm
    [oswald@localhost ~]$ pwd
    /home/oswald
    [oswald@localhost ~]$ id
    uid=500(oswald) gid=500(oswald) groups=500(oswald)
    [oswald@localhost ~]$ ls -l
    total 14452
    -rw-r--r--  1 oswald 14765264 Oct 12 14:43 webstack-image-1.5-b09-redhat-i586.zip

    My current working directory is my home directory. There is enough space left on my file system. I'm a non-root, regular user, and I see the downloaded IPS installer image.

  3. Now I'm unzipping the image to my home directory:

    [oswald@localhost ~]$ unzip -q webstack-image-1.5-b09-redhat-i586.zip 

    This creates a directory named webstack1.5 by default. You can rename it to anything you want or simply keep the default name. In this case I choose to name it demo:

    [oswald@localhost ~]$ mv webstack1.5 demo

  4. Now I change into that directory and start the Update Tool:

    [oswald@localhost ~]$ cd demo
    [oswald@localhost demo]$ bin/updatetool

    sws_02.jpg

    So, what we're seeing here is the Update Tool. Our current Sun GlassFish Web Stack installation is highlighted on the left-hand side. If you have other products installed or multiple Web Stack installs, all these images will show up too. In this case I have only one image, so I click on Available Add-ons.

    Now I pick all the components I want to use in this demo, so I select Apache HTTP server, MySQL server, PHP Server and the PHP MySQL connector. By the way, if I realize I need more packages later on, I can come back here anytime and install whatever I need. For now I've selected everything I want, and all I have to do is press the beautiful green install button above the package list.

  5. After downloading, I get back to the shell and use the command line tool pkg to review the list of installed packages:

    [oswald@localhost demo]$ bin/pkg list
    NAME (PUBLISHER)                              VERSION         STATE      UFIX
    pkg                                           1.111.3-30.2210 installed  ----
    pkg-toolkit-incorporation                     2.2.1-30.2210   installed  ----
    python2.4-minimal                             2.4.4.0-30.2210 installed  ----
    sun-apache22                                  2.2.11-1.5      installed  ----
    sun-mysql51                                   5.1.30-1.5      installed  ----
    sun-mysql51lib                                5.1.30-1.5      installed  ----
    sun-php52                                     5.2.9-1.5       installed  ----
    sun-php52-mysql                               5.2.9-1.5       installed  ----
    sun-wsbase                                    1.5-1.5         installed  ----
    updatetool                                    2.2.1-30.2210   installed  ----
    wxpython2.8-minimal                           2.8.8-30.2210   installed  ----

    You can also use pkg to install components, like in the Update Tool. So for the terminal addicted: if you don't like a GUI, there is also a command-line alternative.

  6. Okay, let's go on. I now start the Apache web server:

    [oswald@localhost demo]$ bin/sun-apache22 start
    Starting apache22

    And to check if the Apache is really running and working I'll take my browser to http://localhost. But wait a second...

    As we probably all know, by default, the Apache web server runs on port 80, which is the default port on which most Web servers run. But in the Unix world all port numbers below 1024 require root permissions to bind. And since I started Apache as a normal non-root user, the Apache will be unable to bind to port 80.

    If you use Web Stack as a non-root user, Web Stack will add 10000 to every port number below 1024. So in this case Apache will use 10080 as its favorite port.

  7. At http://localhost:10080/ we find the welcome page of Sun GlassFish Web Stack:

    sws_03.jpg

  8. But what's life with just an Apache? Let's also start MySQL and let the database server join our team:

    [oswald@localhost demo]$ bin/sun-mysql51 start
    Starting mysql51

  9. Because everyone on my system and on my network can now access my MySQL server, I really need to secure my installation by setting a password for MySQL's root user:

    [oswald@localhost demo]$ mysql/5.1/bin/mysqladmin -u root password "demo"

    Okay, demo is probably not a secure password for a production environment, but for demo purposes it's a very appropriate one.
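
The port rule from step 6 is easy to express in a few lines of PHP. This is only an illustration of the rule as described there; webstack_port() is a hypothetical helper name and not part of Web Stack.

```php
<?php
// Hypothetical helper that mirrors the rule from step 6: for non-root
// users, every port number below 1024 is shifted up by 10000.
function webstack_port(int $default_port, bool $is_root): int
{
    if (!$is_root && $default_port < 1024) {
        return $default_port + 10000;
    }
    return $default_port;
}

echo webstack_port(80, false), "\n";   // 10080 (Apache as regular user)
echo webstack_port(80, true), "\n";    // 80 (Apache as root)
echo webstack_port(3306, false), "\n"; // 3306 (not below 1024, unchanged)
```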

That's all. Your AMP stack is now ready.

Example Deployment

As an example I will now install the famous blogging software WordPress within my new AMP stack. Not because it's so difficult to do, but because it's a good and pragmatic way to show that the AMP stack is working and capable of running popular web applications out of the box.

  1. First I need to create a database and a user for WordPress to access the database and to store its data.

    Okay, that should be an everyday task for a MySQL DBA:

    [oswald@localhost demo]$ bin/mysql -uroot -pdemo
    Welcome to the MySQL monitor.  Commands end with ; or \g.
    Your MySQL connection id is 2
    Server version: 5.1.30-log Source distribution
    
    Type 'help;' or '\h' for help. Type '\c' to clear the buffer.
    
    mysql> CREATE DATABASE wordpress;
    Query OK, 1 row affected (0.01 sec)
    
    mysql> GRANT ALL PRIVILEGES ON wordpress.* TO 'wordpress'@'localhost'
        -> IDENTIFIED BY 'demo';
    Query OK, 0 rows affected (0.98 sec)
    
    mysql> QUIT

    I start the MySQL monitor, aka the MySQL command-line tool, to connect to the database server. I use the CREATE DATABASE statement to create a database named wordpress. And with the GRANT statement I create a user named wordpress identified by the password demo. Again this is probably not a good password for production use, but for a demo this is perfecto.

  2. After changing into Apache's document root directory I wget the latest WordPress version:

    [oswald@localhost demo]$ cd var/apache2/2.2/htdocs
    [oswald@localhost htdocs]$ wget -q http://wordpress.org/latest.tar.gz

    With GNU tar I extract the archive:

    [oswald@localhost htdocs]$ /usr/sfw/bin/tar xfz latest.tar.gz

    This creates a directory called wordpress.

  3. Now I can access my WordPress installation by pointing my browser to http://localhost:10080/wordpress.

    sws_05.jpg

    From now on it's very easy. Simply enter the details for the database server connection:

    The database name: wordpress.
    The user name: wordpress.
    The password: demo.
    The database host: localhost.
    And the table prefix wp_.

    And on the next page:

    sws_06.jpg

    I choose a name for my blog, and enter my email address. Press Install WordPress. And now it only takes a few moments and the installation is done. (Write down your random admin password, you will probably need it later.)

  4. Now I want to check if the WordPress installation was successful and again point my browser to http://localhost:10080/wordpress.

    sws_04.jpg

    And, voilà, there it is: my shiny new "Sun GlassFish Web Stack" blog. Hooray.

That's all. Welcome to the wonderful world of AMP.

This blog copyright 2009 by Kai Seidler