An introduction to Checksums

Hi everyone

This is a quick introduction to checksums and how to use them in practice, written at the request of someone through the Ask Johno section of the blog.

I know what a checksum is, but I’m not sure how to implement or use one (in PHP) to achieve what I need.

Anonymous Asker

They go on to explain what they are trying to achieve, which is essentially to verify a file hasn’t changed (in this case it’s the HTTP document sent by an API) before doing something.

Firstly, a checksum is a very small piece of data (a datum) calculated from something larger, in such a way that a change to the larger thing will almost always change the checksum.

In my experience there are two main reasons to use one:

  1. To detect a change in something larger
  2. To verify that a piece of data matches what was sent by its originator (for example, a checksum on an API payload)

I will very briefly cover both of these with PHP examples.

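Note: I am using MD5 for simplicity. It is fine for spotting accidental changes, but it is no longer considered safe against deliberate tampering, so depending on your requirements it likely won’t be the best option for your use case.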

Edit: Please see Further Reading at the bottom of this article for more information about hashing

Using a Checksum to Detect Changes

Now, I don’t know what data you’ve got, but whatever it is, we are going to need a string representation of it. There are two main ways of getting this:

$stringRepresentation = serialize($myThingToCheck);

or

$stringRepresentation = json_encode($myThingToCheck);

Personally, I would advise PHP’s serialize, because it works with objects, which unserialize can later instantiate again. However, the choice is yours depending on your use case; JSON is usually smaller than a PHP serialization.
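Whichever function you pick, the checksum is only as stable as the string behind it. As a minimal sketch (the helper name sortKeysRecursively is made up for illustration, and it assumes the data is plain nested arrays), sorting the keys before encoding means two arrays holding the same data in a different key order give the same checksum:

```php
<?php
// Hypothetical helper: sort array keys at every level so the encoded string is stable.
function sortKeysRecursively(array $data): array
{
    foreach ($data as $key => $value) {
        if (is_array($value)) {
            $data[$key] = sortKeysRecursively($value);
        }
    }
    ksort($data);

    return $data;
}

$a = ['foo' => 'Bar', 'another' => 'Thing'];
$b = ['another' => 'Thing', 'foo' => 'Bar'];

// Same data, different key order: the raw JSON strings differ, so the checksums differ.
var_dump(md5(json_encode($a)) === md5(json_encode($b))); // bool(false)

// After sorting the keys, both arrays encode to the same string.
var_dump(
    md5(json_encode(sortKeysRecursively($a))) === md5(json_encode(sortKeysRecursively($b)))
); // bool(true)
```

Whether key order should count as a change is up to you; if it should, skip the sorting step.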

Now that we have a string, we need to make it small and easy to check. Something like this will work fine:

$checksum = md5($stringRepresentation);
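If MD5 is not appropriate for your use case, PHP’s built-in hash() function takes the algorithm name as its first argument, so swapping is a one-line change. A minimal sketch, using SHA-256 as one common choice rather than a recommendation for every situation (the sample data is made up):

```php
<?php
$stringRepresentation = json_encode(['foo' => 'Bar']);

$md5Checksum = md5($stringRepresentation);            // 32 hex characters
$shaChecksum = hash('sha256', $stringRepresentation); // 64 hex characters

echo strlen($md5Checksum), ' ', strlen($shaChecksum), PHP_EOL; // prints "32 64"
```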

Now, to check for changes, we just need to compare it with the last checksum that we stored.

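// $oldChecksum is the value stored the last time we checked.
// Use a strict comparison so PHP never treats the two hashes as numbers.
if ($checksum !== $oldChecksum) {
    echo 'Something has changed';
}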

So that covers how to use a checksum to detect changes in pieces of data, which is useful if you’re having to poll for changes.
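Putting those steps together for the polling case, here is a rough sketch. The function name hasChanged and the idea of keeping the last checksum in a small file are my own assumptions for the example; a database column or cache entry would work just as well.

```php
<?php
// Hypothetical helper: returns true when $data differs from what was seen last time.
// The previous checksum is kept in the file at $checksumFile.
function hasChanged($data, string $checksumFile): bool
{
    $checksum = md5(serialize($data));
    $oldChecksum = is_file($checksumFile) ? file_get_contents($checksumFile) : null;

    if ($checksum === $oldChecksum) {
        return false;
    }

    // Remember the new checksum for the next poll.
    file_put_contents($checksumFile, $checksum);

    return true;
}

$file = tempnam(sys_get_temp_dir(), 'chk');

var_dump(hasChanged(['price' => 10], $file)); // bool(true): nothing stored yet
var_dump(hasChanged(['price' => 10], $file)); // bool(false): same data as last time
var_dump(hasChanged(['price' => 12], $file)); // bool(true): the data changed

unlink($file);
```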

Using a Checksum to Verify Validity

This is something I’ve noticed a couple of times, particularly when working in the financial industry, and around certain payment gateways.

It usually looks something like this, but it does change per API integration, so be aware of that and follow the provider’s own documentation.

$secret = 'my_secret_api_key';

$payload = [
    'foo' => 'Bar',
    'another' => 'Thing',
    'datetime' => '2018-12-25 00:00:00'
];

$jsonPayload = json_encode($payload);

$checksum = md5($jsonPayload . $secret);

$payload['checksum'] = $checksum;

// Do the rest of your stuff here, including sending the payload etc.

One advantage of this approach is that the secret itself is never sent with the request; only a checksum derived from it is.

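If you want to verify the checksum on the API side, so that you’re the provider rather than the integrator, you take the checksum out of the payload, calculate it again in the same way and compare the two. Note that both sides must produce exactly the same JSON string for the checksums to match. It would look something like this: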

// Assuming you've done everything you need, and now have the $payload array back

$secret = 'the_secret_youre_expecting';

$checksumProvided = $payload['checksum'];
unset($payload['checksum']);
$checksum = md5(json_encode($payload) . $secret);

if ($checksum !== $checksumProvided) {
    // If the checksums don't match, either the secret used by the sender was wrong
    // or the payload was changed after the checksum was calculated
}
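For reference, PHP also ships hash_hmac(), which is built for combining a message with a secret, and hash_equals(), which compares two strings in constant time. The sketch below is an alternative to the md5($json . $secret) pattern, not what any particular gateway expects. The function names signPayload and verifyPayload and the choice of SHA-256 are my own assumptions, so always follow the provider’s documentation.

```php
<?php
// Hypothetical helper: add a 'checksum' field computed with HMAC-SHA256.
function signPayload(array $payload, string $secret): array
{
    unset($payload['checksum']);
    $payload['checksum'] = hash_hmac('sha256', json_encode($payload), $secret);

    return $payload;
}

// Hypothetical helper: recompute the checksum and compare it with the one provided.
function verifyPayload(array $payload, string $secret): bool
{
    $provided = $payload['checksum'] ?? '';
    unset($payload['checksum']);
    $expected = hash_hmac('sha256', json_encode($payload), $secret);

    // hash_equals() needs two strings; anything else is treated as a failed check.
    return is_string($provided) && hash_equals($expected, $provided);
}

$secret = 'my_secret_api_key';
$signed = signPayload(['foo' => 'Bar', 'another' => 'Thing'], $secret);

var_dump(verifyPayload($signed, $secret)); // bool(true)

$signed['foo'] = 'Tampered';
var_dump(verifyPayload($signed, $secret)); // bool(false)
```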

I think that about covers this topic, as a brief introduction to checksums, and how they’re often used in PHP.

Further Reading

PHP: hash() for more information about the best ways to hash data

I have deliberately not gotten into the discussion over hashing algorithms, as it really is out of the scope of this article, and indeed a whole book could be written on the topic alone.

Thanks to u/artemix-org and u/BradChesney79 on Reddit for suggesting this edit.

RESTful APIs – An accurate description

Hi everyone

Today I thought I would do a quick post to cover RESTful APIs and what they are. The reason for this article is that I have, on numerous occasions, encountered developers (and indeed whole teams) who have misunderstood this concept at its very core. This causes a number of problems. Firstly, if you don’t understand RESTful APIs fundamentally, you’re likely to encounter integration issues quite early on. Secondly, and unfortunately, if you’re a candidate interviewing for a role and haven’t properly understood what a RESTful API is, you’ll come unstuck in the interview.

What RESTful is not (necessarily)

  • A buzzword for a JSON API
  • An API with obscure functionality

What a RESTful API is, and what it has

  • REST stands for REpresentational State Transfer. RESTful is an adjective, so a RESTful API is an API which subscribes to the REST principles
  • The purpose of REST is to ensure that APIs are easy to understand at a universal level
  • RESTful APIs will have endpoints which represent entities
  • Those endpoints will respect HTTP methods (also referred to as verbs) to represent the actions you wish to take

Okay, so explain to me the HTTP methods/verbs

  • GET requests represent reading this entity/collection from the data source
  • POST requests represent creating an entity/collection in the data source
  • PUT requests update (destructively, replacing) the entity in the data source
  • PATCH requests update (partially) the entity in the data source
  • DELETE requests remove the entity from the data source
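To make the difference between the verbs concrete, here is a toy sketch in which a plain PHP array stands in for the data source. The function name handle and its parameters are invented for this example; a real API would sit behind a router and a database.

```php
<?php
// Hypothetical in-memory handler: $store plays the role of the data source.
function handle(string $method, array &$store, ?int $id = null, array $body = [])
{
    switch ($method) {
        case 'GET':    // read one entity, or the whole collection when no ID is given
            return $id === null ? $store : ($store[$id] ?? null);
        case 'POST':   // create a new entity and return its ID
            $store[] = $body;
            return array_key_last($store);
        case 'PUT':    // replace the entity entirely
            $store[$id] = $body;
            return $store[$id];
        case 'PATCH':  // merge the given fields into the existing entity
            $store[$id] = array_merge($store[$id] ?? [], $body);
            return $store[$id];
        case 'DELETE': // remove the entity
            unset($store[$id]);
            return null;
    }

    throw new InvalidArgumentException("Unsupported method: $method");
}

$customers = [];
$id = handle('POST', $customers, null, ['name' => 'Ann', 'email' => 'ann@example.com']);

handle('PATCH', $customers, $id, ['email' => 'new@example.com']); // name is kept
handle('PUT', $customers, $id, ['name' => 'Ann B']);              // email is gone

print_r(handle('GET', $customers, $id)); // only 'name' => 'Ann B' remains
```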

How does that work in practice?

Let’s use the example of a payment provider. You may have endpoints for entities such as

  • api/customer
  • api/payment
  • api/payment/paymentId/refund

You will always need a specification, but in theory you know that if you send a POST to api/customer you will create a customer, and if you send a PATCH you’ll partially update that customer. If you send a POST to api/payment you will create a payment, for which you will receive a reference (ID). If you then send a POST request to api/payment/id-you-received/refund, you will create a refund against the payment identified by that ID.
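As a rough illustration of how method and path together identify the action, here is a small sketch. The function describeRequest and its route table are hypothetical; the paths mirror the payment provider example, and the api/customer/{id} path for PATCH is my own assumption about where a single customer would live.

```php
<?php
// Hypothetical route table: "METHOD path" => what the request means.
function describeRequest(string $method, string $path): string
{
    $routes = [
        'POST api/customer'            => 'create a customer',
        'PATCH api/customer/{id}'      => 'partially update a customer', // assumed path
        'POST api/payment'             => 'create a payment',
        'POST api/payment/{id}/refund' => 'create a refund against a payment',
    ];

    foreach ($routes as $route => $description) {
        [$routeMethod, $routePath] = explode(' ', $route);

        // Turn "{id}" into a pattern matching exactly one path segment.
        $pattern = '#^' . str_replace('\{id\}', '[^/]+', preg_quote($routePath, '#')) . '$#';

        if ($method === $routeMethod && preg_match($pattern, $path) === 1) {
            return $description;
        }
    }

    return 'not found';
}

echo describeRequest('POST', 'api/payment'), PHP_EOL;               // create a payment
echo describeRequest('POST', 'api/payment/abc123/refund'), PHP_EOL; // create a refund against a payment
echo describeRequest('DELETE', 'api/payment'), PHP_EOL;             // not found
```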

So what’s the point?

The main idea behind RESTful APIs, as with other standards such as PSR-X, is to unify the way in which we build APIs. If I say to an organisation with whom I am going to be integrating “we have a RESTful API you can integrate with”, they know, with some level of certainty, how much work is involved in working with it. They also do not need an in-depth understanding of my local design, architecture, etc., because the RESTful design abstracts away any need for that knowledge.

In summary

RESTful APIs are great, and other APIs which are not RESTful can also be great. I just wanted to help inject some clarity on the topic, although, of course, there is plenty of information on the internet around this particular methodology. If one person reads this article and gains an actual understanding of what RESTful is, then it has served its purpose.

Edit: HATEOAS further reading

Valid point raised by DarkTechnocrat on Reddit: for further reading beyond a very basic understanding of REST, you probably should read up on HATEOAS. At a quick glance, this article on spring.io looks like a good place to start 🙂