Amazon S3 HEAD Request 403 Solution

I wanted to check whether some podcast mp3 files were still available online. My first idea was to use cURL and make a HEAD request for each file. And this is where it got weird.

Some of the files were hosted on foursquare.com which, as it turns out, internally redirects to an Amazon S3 bucket.

So step 1 was to make sure the cURL HEAD request follows all redirects. No problem here.

    curl_setopt( $ch, CURLOPT_FOLLOWLOCATION, true );

But then something weird came up: The HEAD request returned a 403 HTTP status code (Forbidden) while the files could be opened in a browser – so they were accessible, right?!

A look into the Amazon documentation offered two explanations:

  1. HEAD requests require authentication (the request has to be signed the Amazon way) and
  2. 403 will be returned if you don’t have the permission to READ the target file.

I will spare you the multiple false paths I took and immediately jump to the conclusions:

The short answer is: As far as I could tell, Amazon S3 refused unsigned HEAD requests for these files from anyone other than the owner!

But there’s a little “hack” if you will. Instead of making a full GET request you can use cURL to request only a few bytes (I think the first 4 are reasonable enough for this). Byte ranges are zero-based and inclusive, so the first 4 bytes are 0-3.

    curl_setopt( $ch, CURLOPT_RANGE, "0-3" );

What you will get is a) some tiny chunk of the target file and b) the proper(!) HTTP status code for this file. In my case this was a 206 (Partial Content), which is the regular answer to a successful range request.

The advantage of requesting the first few bytes is that you will get the proper status code. The downside is that this is much slower than a HEAD request. I mean, a few bytes are not the end of the world but the time for such requests adds up!

So my final approach to this problem is to do a HEAD request first. Only if the status code is >= 400 do I make a “chunk request” (as I call it) and check whether its status code differs from the one returned for the HEAD request.
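The HEAD-first approach could be sketched roughly like this. The function names requestStatus() and resolveStatus() are made up for illustration and are not part of my Request class, and the 10 second timeout is an arbitrary assumption.

```php
<?php
/**
 * Hypothetical helper: performs one cURL request and returns the HTTP status code.
 * With $headOnly = true a HEAD request is made, otherwise only the first 4 bytes are requested.
 */
function requestStatus( string $url, bool $headOnly ): int {

  $ch = curl_init( $url );

  curl_setopt( $ch, CURLOPT_FOLLOWLOCATION, true );
  curl_setopt( $ch, CURLOPT_RETURNTRANSFER, true );
  curl_setopt( $ch, CURLOPT_TIMEOUT, 10 ); // assumed value

  if( $headOnly ) {
    curl_setopt( $ch, CURLOPT_NOBODY, true ); // turns the request into a HEAD request
  }
  else {
    curl_setopt( $ch, CURLOPT_RANGE, '0-3' ); // bytes 0 to 3 = the first 4 bytes
  }

  curl_exec( $ch );

  // 0 means that no HTTP response was received at all
  return (int) curl_getinfo( $ch, CURLINFO_HTTP_CODE );

}

/**
 * HEAD first; only if that fails (>= 400), fall back to the slower chunk request.
 * $request can be swapped out, e.g. for a stub when testing.
 */
function resolveStatus( string $url, ?callable $request = null ): int {

  $request = $request ?? 'requestStatus';

  $status  = $request( $url, true );

  if( $status >= 400 ) {
    $status = $request( $url, false );
  }

  return $status;

}
```

Passing the request function as a parameter is only there so the fallback logic can be tried out without hitting the network.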

And this is what my method (of my Request class) looks like now:

  /**
   * Request a chunk of bytes from a targeted URL,
   * mainly to get the "actual" HTTP status code.
   *
   * @param   string  $url        Targeted URL - e.g. a mp3 file stored on the Amazon S3 cloud storage...
   * @param   int     $timeout    Timeout for making the request, in seconds.
   * @param   int     $bytes      Number of FIRST bytes to be requested, instead of loading the full content.
   * @return  array               An array containing 'http_status_code', 'content', 'error'
   */
  public function requestBytes( $url, $timeout = null, $bytes = 4 ) {
    
    $timeout        = ( $timeout > 0 )  ? (int) $timeout  : $this->defaultTimeout; // $this->defaultTimeout is -who would have thought- where the default value is stored
    
    $bytes          = ( $bytes < 4 )    ? 4               : (int) $bytes;


    $ch             = curl_init();
    
    curl_setopt( $ch, CURLOPT_URL, $url );

    curl_setopt( $ch, CURLOPT_TIMEOUT, $timeout );

    /*  Follow HTTP redirects (like 301s and 302s). */
    curl_setopt( $ch, CURLOPT_FOLLOWLOCATION, true );

    /* Return value as string instead of outputting it directly */
    curl_setopt( $ch, CURLOPT_RETURNTRANSFER, true );
    
    /* Only read/request the FIRST n bytes! Ranges are zero-based and inclusive, hence the -1. */
    curl_setopt( $ch, CURLOPT_RANGE, "0-" . ( $bytes - 1 ) );
    

    /* The chunk bytes */
    $chunk          = curl_exec( $ch );

    /* The status code */
    $httpStatusCode = curl_getinfo( $ch, CURLINFO_HTTP_CODE );

    $curlError      = curl_errno( $ch );

    curl_close( $ch );
    
    return array(
      'http_status_code'  => $httpStatusCode,
      'content'           => $chunk,
      'error'             => $curlError
    );
        
  }
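The returned array could then be evaluated along these lines. isAvailable() is a hypothetical helper, not part of the class above, and treating every 2xx/3xx status as “available” is my assumption.

```php
<?php
/**
 * Hypothetical helper: decides whether a result array describes a reachable file.
 */
function isAvailable( array $result ): bool {

  // curl_errno() returns 0 if the request itself went through
  if( $result['error'] !== 0 ) {
    return false;
  }

  $status = (int) $result['http_status_code'];

  // 200 (OK) and 206 (Partial Content) both fall into this range
  return ( $status >= 200 && $status < 400 );

}

// Sample result, shaped like the return value of requestBytes()
$result = array(
  'http_status_code'  => 206,
  'content'           => 'ID3',
  'error'             => 0
);

var_dump( isAvailable( $result ) ); // bool(true)
```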

This whole thing already cost me a full afternoon with all the research and reading up. And I still hope there’s a better (faster) solution for this. So I’m open to suggestions anytime!

The Apple SSL Bug and What PHP Developers Can Learn From It

A bug in Apple’s code for exchanging SSL keys has been found and published. Even when you only take a quick look, the issue here is pretty clear.

The code consists of a bunch of if-statements, each followed by only a single action. In C, as well as in PHP, you don’t have to wrap a lone statement in curly brackets, as you would have to when you’re dealing with more than one.

Though it’s a known best practice to always wrap your code into such curly brackets, no matter how silly you think this is, you can see this kind of laziness all the time.

The problem is that as soon as you’re adding more code you might not see what still belongs to the if-statement and what doesn’t.

And this is what happened here: At some point the SSL code always jumps to the fail label and never executes the checks below it.



if ((err = SSLHashSHA1.update(&hashCtx, &signedParams)) != 0)
        goto fail;
        goto fail;

What should have been written instead was either


if ((err = SSLHashSHA1.update(&hashCtx, &signedParams)) != 0) {
        goto fail;
        goto fail;
}

in which case everything would have been just fine. Or


if ((err = SSLHashSHA1.update(&hashCtx, &signedParams)) != 0) {
        goto fail;
}
        goto fail;

in which case the bug would still be there but would have been found much more easily, since it’s clear that the second goto fail is running on its own.
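The same trap exists in PHP. The following is only a sketch of the pattern with made-up function names and integer error codes (0 meaning “no error”), not a port of Apple’s code:

```php
<?php
// Without brackets: the indentation suggests that both returns belong to the if,
// but only the first one does. The second one always runs.
function verifyUnbraced( int $hashErr, int $signatureErr ): int {

  if( ( $err = $hashErr ) !== 0 )
    return $err;
    return $err; // stray duplicate: always executed, and $err is 0 at this point

  if( ( $err = $signatureErr ) !== 0 ) // never reached
    return $err;

  return $err;

}

// With brackets: every check is actually executed.
function verifyBraced( int $hashErr, int $signatureErr ): int {

  if( ( $err = $hashErr ) !== 0 ) {
    return $err;
  }

  if( ( $err = $signatureErr ) !== 0 ) {
    return $err;
  }

  return $err;

}

var_dump( verifyUnbraced( 0, 1 ) ); // int(0) - the signature error is silently ignored
var_dump( verifyBraced( 0, 1 ) );   // int(1)
```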

To me it looks like this has been a simple copy/paste mistake. It’s not Apple specific by any means but I think it’s great if you can argue by pointing to such a prominent reference!

Again and to sum it up: Always wrap your conditionals into curly brackets!


Custom Sort WordPress Results

I’ve heard this question at least a million times now: How do I sort a custom query or a category list just the way I want it?

I’ve actually written a little helper function ages ago. And I’m aware there are other ways but this one works just fine in almost every case.

Just add this function to your theme’s functions.php or your plugin code:

/**
 * Sorts a WordPress result array by custom criteria
 * Works for all kinds of results that are arrays of the form $result[0]->memberVariable1 etc.
 *
 * By Chris Doerr (http://wordpress.org/support/profile/zenation)
 *
 * @param   array   $results    WordPress results
 * @param   string  $by         Member variable to sort by (case-sensitive, e.g. 'ID' or 'slug').
 * @param   string  $order      (optional) 'desc' for 'descending', everything else will be treated as 'ascending'
 * @return  array               The newly sorted results array.
 */
if( !function_exists( 'sortWpResults' ) ) {
  function sortWpResults( $results, $by, $order = 'asc' ) {

    // Flip the comparison result for descending order.
    $direction  = ( strtolower( $order ) == 'desc' ) ? -1 : 1;

    // create_function() was removed in PHP 8, so a closure is used instead.
    // The spaceship operator (<=>) requires PHP 7 or newer.
    usort( $results, function( $a, $b ) use ( $by, $direction ) {
      return $direction * ( $a->$by <=> $b->$by );
    } );

    return $results;

  }
}

“WordPress results” can be a lot of things: the posts of a custom query (WP_Query), get_posts(), get_the_category(), to name just a few. A demo usage could then be something like

$categories = sortWpResults( get_the_category(), 'slug', 'asc' );

which would get a list of the current post’s categories, sorted by the SLUG (not name!).
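To try this out without a WordPress installation, here is a small sketch using plain objects. The fake categories are made up, and the compact function definition is just a stand-in for the helper so the snippet runs on its own (it needs PHP 7 or newer).

```php
<?php
// Stand-in for the helper so that this snippet runs on its own.
if( !function_exists( 'sortWpResults' ) ) {
  function sortWpResults( $results, $by, $order = 'asc' ) {
    $direction = ( strtolower( $order ) == 'desc' ) ? -1 : 1;
    usort( $results, function( $a, $b ) use ( $by, $direction ) {
      return $direction * ( $a->$by <=> $b->$by );
    } );
    return $results;
  }
}

// Fake "categories" (plain objects instead of real WordPress terms)
$categories = array(
  (object) array( 'name' => 'News',     'slug' => 'news' ),
  (object) array( 'name' => 'About Us', 'slug' => 'about' ),
  (object) array( 'name' => 'Podcast',  'slug' => 'podcast' ),
);

$sorted = sortWpResults( $categories, 'slug', 'desc' );

foreach( $sorted as $category ) {
  echo $category->slug . "\n";
}
// podcast
// news
// about
```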

Software Versioning

There are many, many ways for naming versions of your software but only a few that really make sense to me. And I’m not saying this in an absolute term but with a specific idea in mind.

As a developer you will (or at least should) track your code with version control software like Git, Mercurial, SVN, etc. These applications have their own way of identifying versions, which is usually far from intuitive, like Git’s commit hashes that look like random strings.

If you wrote software that only you would ever use, this might already be enough. But as soon as several people are involved, especially people outside your developer circle, offering a human-readable version number is a must.

There are basically three groups of target audiences that require that approach:

  1. Third-party developers
  2. Software/Usability testers
  3. Customers, users, admins

It’s been a trend lately to kinda pervert the versioning system by increasing the numbers so fast that they no longer convey any information other than “this version seems to be newer than the other”. The web browsers Chrome and Firefox are shining (negative) examples of that! I would highly recommend either getting rid of version numbers completely or using a system that can make sense to almost everyone.

By far, the most common versioning pattern is {full version}.{feature}.{bug(fix)}({stage})

So when you update from version 1.2.0 to 1.5.2 you can immediately see that a few features have been added (or removed) and that some bug fixes have been made as well. Or if you’re about to install 1.5.3b you should be aware that this software is still in a beta stage and that it might not be working 100% at this point. And so on and so on.
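To illustrate how much information such a version string carries, here is a small sketch that splits it into its parts. parseVersion() and the array keys are made up for this example, and I’m assuming the stage is written as lowercase letters directly after the last number (like the “b” in 1.5.3b).

```php
<?php
/**
 * Hypothetical helper: splits "{full version}.{feature}.{bug(fix)}({stage})" into its parts.
 * Returns null if the string does not follow the pattern.
 */
function parseVersion( string $version ): ?array {

  if( !preg_match( '/^(\d+)\.(\d+)\.(\d+)([a-z]*)$/', $version, $matches ) ) {
    return null;
  }

  return array(
    'full'    => (int) $matches[1],
    'feature' => (int) $matches[2],
    'bugfix'  => (int) $matches[3],
    'stage'   => $matches[4] // empty string for a regular release
  );

}

$old = parseVersion( '1.2.0' );
$new = parseVersion( '1.5.2' );

echo ( $new['feature'] - $old['feature'] ) . " feature releases in between\n"; // 3 feature releases in between

$beta = parseVersion( '1.5.3b' );
echo ( $beta['stage'] === 'b' ) ? "still in beta\n" : "regular release\n";     // still in beta
```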

Often developers struggle with how to assign what number for what achievement. And there also seems to be a fear of making full version releases. When you take a look at directories like WordPress.org/extend/plugins/ you will see a vast number of plugins that may have been out there for years and never seem to have reached version 1.0! It’s only my personal opinion but I think that’s just stupid. Maybe some developers think that a software still being a 0.5.22 is a better excuse if something’s wrong (“Dude, it’s not even the final version!”). But if you take the term “final” literally, that’ll be the stage of your software when it’s no longer being developed or supported at all. That’s ‘final’!

When you start a new software project you should do some planning ahead. And one part of that process is to make a list of features or functionalities that you think the first public release should have. Every software has a purpose and this core can be expected to be working when someone is installing your software.

So what I do is to rank the features in terms of importance and other dependencies and assign a number to each of them. So until the software is being released, the {full version} number will be 0 and the {feature} number will increase according to the number of finished features. If each feature of your list has been implemented you can do a major round of testing and documentation etc and if that’s also done, you can (and should) release version 1.0.

From here on, you can either react to feedback (bug reports and feature requests) or implement some new ideas by increasing the {feature} and {bug(fix)} numbers to an open end. And if you make some major changes, like to the architecture, or you’re implementing some new bigger features you might think (twice) about going for a new {full version} number.

Just as a reminder: Version numbers are not necessarily for you and your team of developers but for all the other people out there. So make them as clear and understandable as possible!

Other Versioning Patterns

So why are there even other patterns if the one mentioned above is meant to be so great? Well, other patterns have other intentions, simple as that.

For example, when using {yy}.{mm}.{dd}.{n} you can instantly see when a release has been made. This might be handy if you know that your environment has changed at some point in time. Or you can see when a product has not been updated in ages – which then might be a good indicator that it’s not being supported any longer.

You can also combine these two aspects to a certain extent by adding the release year to your product name or starting the {full version} number with the year but keeping the rest as mentioned above.
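Generating such a date-based version is trivial. In this sketch dateVersion() is a made-up name and I’m assuming that {n} simply counts the releases made on the same day:

```php
<?php
/**
 * Hypothetical helper: builds a {yy}.{mm}.{dd}.{n} version string.
 */
function dateVersion( DateTimeInterface $releaseDate, int $n ): string {

  // 'y' = two-digit year, 'm' = month and 'd' = day, both with leading zeros
  return $releaseDate->format( 'y.m.d' ) . '.' . $n;

}

echo dateVersion( new DateTimeImmutable( '2013-02-24' ), 2 ); // 13.02.24.2
```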

And if you can be sure that the version number won’t matter to the average Joe or Jane you can get a bit more technical and include build numbers (which of course also depends on the type of programming you’re using).

So to sum it up: Think about what the versioning should communicate and to whom this might be useful, and don’t be clever about it. Simplicity’s king here!

Programming Fonts

In September of last year (2012) Adobe released a free font that was especially created for software developers. Not only do I like the new font, called “Source Code Pro”, the blog post that went along with the publishing (http://blogs.adobe.com/typblography/2012/09/source-code-pro.html) was interesting to read, too.

I know a thing or two about fonts and typography (I highly recommend the documentary “Helvetica”!) and I even bought a book about this topic a while ago. But I’m far from being an enthusiast, especially when it comes to “new fonts” – mainly because there are way too many blog posts like “20 hottest fonts for X and Y”.

But like many things in life, you sometimes have a feeling about something being good or bad without being able to put your finger on why exactly. The blog post gives a nice, brief introduction to what’s important about a font that developers use in a terminal application or an IDE, for example.

Well, the most obvious criterion might be that certain characters should be distinct so you wouldn’t confuse them with each other, like the letters i and l and the | character (like in the OR operator ||) or the letter O and the digit 0 (zero). For example, the Adobe font has a little dot inside the digit 0 which might not be, well, beautiful but it definitely helps a lot!

Also all the different kinds of brackets (curly, round, square) should be distinct and so on and so on…

Maybe it’s just personal taste but, for example, I prefer the asterisk to be centered, not superscript. And last but not least, since code indentation is an important part of creating clean code that’s easy to read, the font should be a monospace type.

“Source Code Pro” can be downloaded on the Adobe website so you can install it in your operating system but -and I really like that option a lot- it’s also available as a webfont (via Adobe Typekit as well as Google Web Fonts) so you can style your HTML <code> blocks, too.

PHP in HTML or HTML in PHP?

What’s nice about PHP (among other things) is that code blocks can easily be nested inside the HTML of a page. This way you can insert dynamic content just like that. Of course you can also go the other way and echo out HTML via PHP.

My simple rule is: When one of the two clearly outweighs the other, use it as your main code and nest the other one inside.

This makes source code way more readable than, for example, when you have orgies of opening and closing PHP tags inside “some” HTML tags!

Let me give you an example (do you recognize where it’s from?).

Original Code

<?php if (have_posts()) : ?>
<?php while (have_posts()) : the_post(); ?>
<div>
<?php the_content('Read the rest of this entry &raquo;'); ?>
</div>
<?php endwhile; ?>
<?php else : ?>
 <b>Not Found</b>
<?php _e("Sorry, but you are looking for something that isn't here."); ?>
<?php endif; ?>

Cleaned up code

<?php
if( have_posts() ) {

  while( have_posts() ) {
    
    the_post();

    echo "<div>\n";

    the_content( 'Read the rest of this entry &raquo;' );

    echo "</div>\n";

  }
}
else {
  
  echo "<b>Not Found</b>\n";
  echo __( "Sorry, but you are looking for something that isn't here." );

}
?>

Well, I’ve also done some more formatting (according to my coding standard) and replaced the (in my mind) horrible and old-school “: endif / endwhile” syntax. But you get my point, right?

When you’re writing new code, always keep that in mind. And if you’re refactoring old (legacy) code this cleaning up should be your absolute first step!

2 Simple jQuery Hacks To Remember

It’s pretty common to use the $ symbol (variable) as a shortcut instead of always writing “jQuery”.

But other JS libraries use this character as well. And sooner or later this might cause some weird and nerve-wracking-to-find bugs. And even if you’re not using other libraries at the moment, who says you won’t in the future…

To avoid potential conflicts you can simply declare a local (scope) variable inside your document-ready block and then you’re good to go:


jQuery(document).ready(function(){
  var $ = jQuery;
  ...
});

And if you’re starting by selecting a certain element, you can speed things up by checking for its existence before running the rest of your code:


jQuery(document).ready(function(){

  /* local alias, see the first hack */
  var $ = jQuery;

  var test = $('#myElement');

  /* if the element does not exist, don't even bother running the rest of the code... */
  if( test.length < 1 ) {
    return;
  }

  ...

});