An introduction to Checksums

Hi everyone

This is a quick introduction to checksums and, practically, how to use them, at the request of someone through the Ask Johno section of the blog.

I know what a checksum is, but I’m not sure how to implement or use one (in PHP) to achieve what I need.

Anonymous Asker

They go on to explain what they are trying to achieve, which is essentially to verify a file hasn’t changed (in this case it’s the HTTP document sent by an API) before doing something.

Firstly, a checksum is a very small data snippet (datum) which represents something larger.

In my experience there are two main reasons to use one:

  1. To verify a change in something larger
  2. To verify that a piece of data matches what was sent by its originator (e.g. a checksum on an API payload)

I will very briefly cover both of these with PHP examples.

Note: I am using MD5 for simplicity, but depending on your requirements it likely won’t be the best option for your use case.

Edit: Please see Further Reading at the bottom of this article for more information about hashing

Using a Checksum to Detect Changes

Now, I don’t know what we’ve got, but whatever it is, we are going to need a string representation of it. There are two main ways of getting this:

$stringRepresentation = serialize($myThingToCheck);

or

$stringRepresentation = json_encode($myThingToCheck);

Personally, I would advise PHP’s serialize() because it preserves objects (and unserialize() will re-instantiate them). However, the choice is yours depending on your use case; JSON output is smaller than a PHP serialization.

Now that we have a string, we need to make it small and easy to check. Something like this will work fine:

$checksum = md5($stringRepresentation);
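
If your requirements call for something stronger than MD5, the same idea works with PHP’s hash() function; a minimal sketch, assuming SHA-256 is acceptable for your use case:

$checksum = hash('sha256', $stringRepresentation);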

Now, to check for changes, we just compare against the last checksum we stored (i.e. the one from the last time something happened).

if($checksum != $oldChecksum){
    echo 'Something has changed';
}

So that covers how to use a checksum to detect changes in pieces of data, which is useful if you’re having to poll for changes.
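Putting that together, a polling check might look something like this (a minimal sketch; getThingToCheck(), loadLastChecksum(), and storeChecksum() are hypothetical helpers standing in for however you fetch your data and persist the last checksum):

// Hypothetical helpers for fetching the data and the previously stored checksum
$myThingToCheck = getThingToCheck();
$oldChecksum = loadLastChecksum();

// Build the string representation and checksum as above
$checksum = md5(serialize($myThingToCheck));

if($checksum != $oldChecksum){
    // Something has changed, so do your work here...

    // ...and remember the new checksum for the next poll
    storeChecksum($checksum);
}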

Using a Checksum to Verify Validity

This is something I’ve noticed a couple of times, particularly when working in the financial industry, and around certain payment gateways.

It usually looks something like this, but it does change per API integration, so be aware of that and follow the provider’s own documentation.

$secret = 'my_secret_api_key';

$payload = [
    'foo' => 'Bar',
    'another' => 'Thing',
    'datetime' => '2018-12-25 00:00:00'
];

$jsonPayload = json_encode($payload);

$checksum = md5($jsonPayload . $secret);

$payload['checksum'] = $checksum;

// Do the rest of your stuff here, including sending the payload etc.

One advantage of this approach is that you never expose the API key in plain text.
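
As an aside, many providers standardise this pattern as an HMAC rather than simply appending the secret, and PHP supports this out of the box with hash_hmac(). A minimal sketch, assuming the provider accepts SHA-256 HMACs (check their documentation before swapping this in):

$payload['checksum'] = hash_hmac('sha256', $jsonPayload, $secret);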

If you are the API provider, rather than the integrator, and want to verify the checksum on your side, you simply do these steps in reverse, which looks something like this:

// Assuming you've done everything you need, and now have the $payload array/object back

$secret = 'the_secret_youre_expecting';

$checksumProvided = $payload['checksum'];
unset($payload['checksum']);
$checksum = md5(json_encode($payload) . $secret);

if($checksum != $checksumProvided){
    // The checksums don't match: either the secret key was wrong, or the payload has changed in transit
}
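
If you are worried about timing attacks on that comparison, PHP also ships hash_equals() for a constant-time check; a minimal sketch:

if(!hash_equals($checksum, $checksumProvided)){
    // Reject the payload
}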

I think that about covers this topic, as a brief introduction to checksums, and how they’re often used in PHP.

Further Reading

PHP: hash() for more information about the best ways to hash data

I have deliberately not gotten into the discussion over hashing algorithms, as it really is out of the scope of this article, and indeed a whole book could be written on the topic alone.

Thanks to u/artemix-org and u/BradChesney79 on Reddit for suggesting this edit.

What is a memory leak? A quick analogy

This is something that came up in conversation: some friends and I were discussing deploying code that runs in the background to production environments.

One of the things I raised was what can happen with daemon processes: should you have a very small inefficiency, given enough time to run (usually by the time it gets to production), it can, and will, destroy live servers.

I then realised that, at this point in the conversation, it had become appropriate to explain what a memory leak actually is.

Anybody who knows me knows I love an analogy; so this is the analogy I used, to give a really simple explanation of what a memory leak is:

Every morning, you go to a fast food drive through.

You order a meal, eat it, and throw the paper bag with some leftovers into the passenger footwell.

At the end of the day, you arrive at home. You pick up the bag of rubbish, and put it into your bin. Without realising it, you drop a single french fry in the car.

In development and testing you run this same process 50 times, dropping a single french fry each time. The fries are not visible, they’re under the passenger seat, or with the momentum of the car have ended up in the back.

When you go live, the process runs more frequently, and instead of a single meal you’re buying 10 at a time.

Very quickly, those single french fries culminate in an unusable car, because you can’t fit in a Honda Civic if it has 1,000,000 festering french fries inside.

Matt “Johno the Coder” Johnson, on a cold winter morning

So there it is, a quick explanation of what a memory leak is, in an easy to understand analogy.

I’ve been asked to do walk-throughs on practical implementations of daemons and a few other topics, so I am going to write those up soon.

Practically, what might it look like?

Imagine your daemon script looks something like this…

// Store the jobs that have been processed
$jobsProcessed = [];

// This is a daemon script, it needs to run, forever
while(true){

    // This is just for demonstration purposes!
    $job = getNewJob();

    // Do whatever you need to, to handle the job

    // Let's store the job we've processed
    $jobsProcessed[] = $job;

}

This all looks fairly innocent, right? In testing there are probably, at most, a few thousand test jobs. In production, when this is running forever, that very small array can become very big. That’s the bit that could cause a server to topple.

For reference, if you do need to keep this information, store it somewhere, anywhere, else. A log file is usually a good shout (as long as you’re periodically cleaning out your log files), or perhaps a database (I’d recommend a MyISAM table for this, as you’re dumping a whole load of plain text data). If you keep this information in a variable in your script, it’ll be held in memory, which is exactly where you don’t want it.
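
To illustrate, the earlier daemon could append each processed job to a log file instead of holding it in memory; a minimal sketch (getNewJob() is the same placeholder as above, and the log path is just an example):

// This is a daemon script, it needs to run, forever
while(true){

    // This is just for demonstration purposes!
    $job = getNewJob();

    // Do whatever you need to, to handle the job

    // Append a line to a log file instead of keeping the job in memory
    file_put_contents(
        '/var/log/my-daemon/jobs-processed.log',
        date('Y-m-d H:i:s') . ' ' . json_encode($job) . PHP_EOL,
        FILE_APPEND
    );

}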

So there it is, a quick and easy analogy, with an (overly) simplified example of what it might look like.

Beware, the Anti Pattern!

In this article I am going to cover the application of patterns within your… application. In short, I am going to show you how to use design patterns in a logical manner.

Patterns are sometimes (but not always) awesome!

The first thing here is that all design patterns have a purpose, every design pattern has its place. Whilst I’m talking about design patterns the same can be said of development methodologies, database designs, and really any other kind of concept.

The proper application of design patterns can take a frustrating piece of software, and make it easy to maintain, or hyper-secure, or crazy performant.

The bad application of a design pattern will do the exact opposite.

But, why would you ever not use a pattern?

Well, when it’s not the right time to use it. Every pattern has its place, but not every pattern should be used everywhere; that’s a bad idea.

Don’t take my word for it, let’s look at some practical examples and experiences that really demonstrate what I am saying here.

I’m going to be deliberately controversial here, and I’m going to pick stuff that all developers seem to love and then prove where it will ruin your application.

I’ve heard of Patterns, what is an Anti Pattern?

An anti-pattern is still a pattern. Subjectively, an anti-pattern is when you take a pattern and either apply it in the wrong circumstances, or implement it badly.

The application of an otherwise great pattern then causes adverse impacts (usually on maintainability, or on performance, etc.). Now, it is an anti-pattern.

Model View Controller (MVC)

We all love MVC, right? I mean, what’s not to love? CodeIgniter, CakePHP, Laravel, Angular, and Joomla all follow this pattern. It is arguably one of the most used design patterns of recent times.

For good reason, it’s awesome! Because it’s awesome, developers are pretty darn impassioned about using it everywhere, and they are right, in the vast majority of cases.

So when, I hear you ask, would this not be a good idea? When could it be an anti-pattern?

What about if I am writing a daemon, which is going to continuously monitor the usage of the mounted hard drives on a server and, in certain scenarios, send an email to a system administrator?

We don’t need models – we’re not handling any data. We don’t output anything, not even CLI output, so there’s no need for views. Realistically there’s no controller either; it’s a standalone script. MVC would be a bit overkill, in this instance, wouldn’t it?
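
As a rough idea of what I mean, such a script might be nothing more than this (a minimal sketch; the mount points, threshold, and email address are made up):

// Hypothetical standalone monitor: no models, views, or controllers required
$mountPoints = ['/', '/var'];
$threshold = 0.9; // Alert when a disk is more than 90% full

while(true){

    foreach($mountPoints as $mountPoint){
        $used = 1 - (disk_free_space($mountPoint) / disk_total_space($mountPoint));

        if($used > $threshold){
            mail('sysadmin@example.com', 'Disk space warning', $mountPoint . ' is ' . round($used * 100) . '% full');
        }
    }

    sleep(300); // Check every five minutes
}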

That’s a bit of a drastic example though, let’s look at something more subtle.

Single Responsibility Principle

Ah, right. Let’s get controversial then, shall we?

A class should only have one reason to change

Single Responsibility Principle, SOLID Principles

Everyone loves this, and likes to really preach about it. It is crazy controversial, widely adopted, and I personally think that it’s a good idea.

The Single Responsibility Principle is like the best song in the world, that you hear 1,000 times per day on every music channel and radio station. It is best practice, yes. But sometimes, it’s okay to break it!

Oh Gosh, quick, get the smelling salts and a wet flannel, they’ve fainted!

The whole point of the SOLID principles is to make well-structured, easy-to-maintain code.

So, time for a real-life example. I have a class, it is an Eloquent Model. It is responsible for some mission-critical, core functionality. This class has a method within it. This method is 150 lines long (probably 50 lines of code, once you take out empty lines and code comments).

This method, in and of itself, could (and maybe even should) be abstracted into its own class. For some context, it is a static method, responsible for fetching records based on a set of arguments.

Every part of the Single Responsibility Principle dictates this method should be abstracted to its own class, perhaps even a set of classes.

Internally, I have had this code reviewed, and to quote the developers who have checked (and indeed worked on) it, it is “exceptionally easy to follow” and “super easy to add and change the functionality”.

It is, essentially, a set of if statements. Based on the outcome of those if statements, the Eloquent Query is modified, then returning either limited, or paginated, results.
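
To give a flavour of it (this is a hypothetical reconstruction to show the shape, not the real class or its real arguments), the method looks roughly like this:

public static function fetchRecords(array $arguments)
{
    $query = static::query();

    if(isset($arguments['status'])){
        $query->where('status', $arguments['status']);
    }

    if(isset($arguments['createdAfter'])){
        $query->where('created_at', '>=', $arguments['createdAfter']);
    }

    // ...many more if statements modifying the query...

    if(isset($arguments['perPage'])){
        return $query->paginate($arguments['perPage']);
    }

    return $query->limit($arguments['limit'] ?? 50)->get();
}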

So, I hear you scream, “why won’t you abstract it?!” – well I could. But following conversations with the development team, the code would actually be harder to follow if I were to abstract and refactor it.

In this instance, the code can be easily found, easily changed, and is incredibly stable.

Following Single Responsibility to the letter, this time, would be an anti-pattern. Rather than making life easier, it would make life more difficult. Abstraction can be bad! If it brings no performance, functional, or security perks, and it makes the code more difficult to maintain and follow, then abstracting would not make any sense, except to follow a pattern for the sake of following a pattern; at that point, it becomes an anti-pattern.

Dependency Injection

Oh God. I’ve been here before. It is bad. I remember the flames, vividly. As the Reddit Hellfire engulfed my computer. Just kidding, but this one does really evoke emotional reactions.

Just quickly and very simply, dependency injection is passing an object (the dependency) into another object that depends on it. Thus, injecting the dependency. It usually looks (forgetting containers and autowiring) something like this.

class Calculator{

    public function __construct(
        AdditionServiceProvider $addition,
        SubtractionServiceProvider $subtraction
    ){
        $this->additionService = $addition;
        $this->subtractionService = $subtraction;
    }
}

The point here is that I can control the services that the Calculator is using. So if I were, at a later date, wanting to swap out my AdditionServiceProvider for AcmeIncAdditionServiceProvider (assuming they implement the same interface, or extend the same base class) then I could, and the rest of the class would work as expected.
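
For illustration (the add() method and the extends relationship here are my own assumptions, purely to show the shape of the swap), that might look like this:

// Hypothetical: assumes AdditionServiceProvider exposes an add() method
class AcmeIncAdditionServiceProvider extends AdditionServiceProvider{

    public function add(float $a, float $b) : float
    {
        // Delegate to Acme Inc's service here instead of the original implementation
        return $a + $b;
    }
}

// The Calculator itself doesn't need to change at all
$calculator = new Calculator(
    new AcmeIncAdditionServiceProvider(),
    new SubtractionServiceProvider()
);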

However, I have, several times, seen things like this.

class PaymentsIncorporatedWrapper{

    public function __construct(
        PaymentsInc\Payer $payer,
        PaymentsInc\Refunder $refunder
    ){
        $this->payer = $payer;
        $this->refunder = $refunder;
    }

}

Right, this makes sense, doesn’t it? Wire up the payer and the refunder, then drop them into your Wrapper. As I said above, setting the dependency injection container to one side, whenever I want to do something with the PaymentsIncorporatedWrapper I have to do something like

$payer = new PaymentsInc\Payer($apiKey, $somethingElse);
$refunder = new PaymentsInc\Refunder($apiKey, $somethingElse);
$wrapper = new PaymentsIncorporatedWrapper($payer, $refunder);

That’s a kind of annoying amount of code to write, to instantiate a class. “But the container does that for you!” I hear you scream. Yes, yes it will.

But why? I don’t know, at the time of instantiation, whether I’ll even need the refunder. I might just be querying a payment. Why do I need the refunder? I don’t. Ah, maybe this is an anti-pattern.

Also, this set of classes is specific to the PaymentsInc integration. So it’s not like I’m going to swap in another payment provider (otherwise this would all make sense).

In this instance, when I couldn’t possibly want to swap anything else in/out, perhaps this would make more sense?

class PaymentsIncFactory{

    public static function getPayer() : PaymentsInc\Payer
    {
        $factory = new static();
        return $factory->buildPayer();
    }

    private function getApiKey() : string
    { ... }

    private function getSomethingElse() : string
    { ... }

    public function buildPayer() : PaymentsInc\Payer
    { ... }

    ...

}

class PaymentsIncorporatedWrapper{

    public function takePayment(float $amount)
    {
        $payer = PaymentsIncFactory::getPayer();
        $payer->takePayment($amount);
    }

}

Anybody who is passionate about Dependency Injection will argue this is wrong, and they will probably cite Unit Testing as the reason. But, to my knowledge, unit testing isn’t justification for using Dependency Injection.

In fact, I have implemented both Unit Testing, and Test Driven Development, without unnecessarily using Dependency Injection. Of course, Dependency Injection was used, just only where it was truly necessary.

And the point of those tales was….

Just because a pattern is the best thing since sliced bread, doesn’t mean you should apply it liberally, everywhere, without thinking about it.

In the above I’ve taken three of the most beloved patterns we possess, and given you three good places where perhaps those patterns were best not applied.

So think about the patterns you’re using; never blindly use one because someone on [insert social website or influencer here] said it is amazing.

The key, as with all things, is to genuinely understand the pattern, its application, its benefits, and its constraints. And then think, and make a decision, about whether it makes sense to apply it in your use case.

Further Reading / Sources

  1. Model View Controller (MVC) – Wikipedia
  2. Single Responsibility Principle – Wikipedia
  3. Dependency Injection – Wikipedia

Practical Implementations of Agile Software Development

So, Agile. It seems like a bit of a buzzword, and my favourite experience of someone completely missing the concept of Agile was a candidate I interviewed. I won’t name them, but our conversation went something like this:

“Do you have any experience working in Agile?”

“Is there any other way to work?”

Okay, this sounds promising! The blank expression on this candidate’s face when I mentioned things like Scrum, Velocity, and Retrospective showed me that the misconception here was that “Agile” simply meant “working to changing specifications”.

Now, moving into Agile can be a scary and daunting task, and the Agile manifesto is a bit obscure, as are the principles of Agile. So the aim of this article is to give some tiny snippets on how you can start to adhere to this manifesto and these principles (and you may already be doing so!).

My plan, for this article, is to go through the manifesto, and how it should worm its way into your everyday working!


Value individuals and interactions over processes and tools

Let’s just let that sink in for a minute. Anybody who has worked with me, especially those who have worked with me at Speed, will agree that I love a flow chart, I love things working like a machine; so this one took a little while to sit properly with me.

The more I think about that “value individuals”, the happier it makes me. What this means is that we need to collaborate, and understand the skills everyone brings to the table.

This one is arguably the easiest to achieve: when you’re mulling over specifications and features, get everybody involved. Start to snowball, get creative. Sit everybody around a table with a whiteboard or some post-it notes (Jay Heal loves a good post-it session! Or a workshop, as they are more accurately described).

People involved in brand will have different thoughts to offer from those thinking of the product from a marketing perspective, which will be different from the UX gurus, and the frontend ninjas, and the data scientists, and the software engineers.

Magic happens when these people are collaborating, and are each individually valued.

Value working software over comprehensive documentation

Again, this was a culture shock to me. I like my software specifications to be really, really comprehensive. However, drawing up these documents takes time, and from a technical delivery standpoint there’s very little value in them; they’re actually more valuable from a commercial standpoint.

So, what’s the alternative? User story planning. Rather than drawing up a million specification points, let’s plan the user’s story, and build that.

This pulls really nicely into Test Driven Development, and into YAGNI (You Aren’t Gonna Need It), because you develop and deliver what’s needed.

It is better to deliver small pieces of working software, frequently, than spend months hashing out the too-fine details of a contract and specification.

From a commercial standpoint this sounds counter-productive, but hear me out: if you keep delivering working software, you are going to avoid cash-flow problems where you’re waiting to get the project signed off, and consequently paid for. Additionally, you’re going to continue to gain the confidence of your client, so they’re going to be far, far more inclined to keep investing in a working relationship than in a “you’ll get to see it one day” black hole of a development team.

Practically then, what do we do? Go through the wireframes and the high-level conceptual specification, enough to know what you need to develop. Then, as soon as you have enough to work on, start coding something, and make sure you show the client. Prototypes are acceptable, but have something to show and deliver. Don’t wait for a grand unveiling moment; make the client part of your development process.

Side note; you’re also likely to avoid costly test cycles and amendment rounds at the end of a project by working like this.

Value customer collaboration over contract negotiation

This leads on really nicely from the last point. Collaborate with your customer. They know their business as well as your team knows the technology.

Rather than having “no” conversations, have collaborative conversations. Using velocity, backlogs, and the points system, once the client has bought into the concept themselves (which you might have to help them do), they can make decisions about how to spend the resources they’re paying for.

Let the sky be blue, let the creativity flow.

Practically, liaise with your clients almost constantly. Depending on your environment you probably can’t have them in your scrum every morning, but you can certainly keep them involved. Collaborate with your clients, don’t just negotiate terms.

Value responding to change over following a plan

This is the key thing with Agile, that makes it considerably different from other project management methodologies, like Waterfall.

This is where your sprints, and an accurate velocity, come into their own. If everything is planned and scoped (with points) in your backlog, and you know the velocity of your team, changing direction on a penny shouldn’t be a problem.

Adjust your sprint, and continue collaborating. I’ve been on projects where the entire success of the project has been based on my team’s ability to change direction in an instant – to seize a new opportunity, respond to a threat, or react to a newly identified strength or weakness.

This is about letting the deliverable lead the project, not the commercials. By working against sprints, you can change direction without necessarily worrying about commercials in the first instance.


Further Reading

Agile Manifesto courtesy of AgileManifesto

Agile Principles courtesy of The Agile Alliance

Featured image credit: Agile Alliance (it is their logo)

Amazon Web Services – EC2 Quickstart

Good afternoon everyone

Today I am going to give a super-quick recipe to get your first EC2 up and running. If you’ve worked with virtual or dedicated servers before, the cloud architecture thing can feel a bit overwhelming, and the same applies if you’ve not had to manage your own environments before. In any case, as you don’t get to install your own OS, it can feel a little disorientating.

This guide does not encourage best practices, it’s simply enough to get you running.

I am assuming you already have a rough understanding of what an EC2 is, and have signed up for your AWS account.

  1. Go to the AWS EC2 dashboard
  2. Click “Launch Instance”
  3. Click the “Select” button for the top row (Amazon Linux 2 AMI (HVM), SSD Volume Type)
  4. You now have some options to configure your instance; for the purposes of this article, I’m simply selecting t2.micro, which is a free-tier general purpose EC2 – when you are setting up production environments, make sure you actually read through these options and make the appropriate decisions
  5. Click “Configure Instance Details” in the bottom right corner
  6. There are a whole bunch of options here which, of course, are really important for a production EC2; however, explaining them is outside the scope of this quick-start guide
  7. Click “Add Storage Details” in the bottom right corner
  8. Here you can configure the details of the storage you want your EC2 to have; for the purposes of this guide I’m going to keep the default 8GB (as I’m not going to need anything more than that)
  9. Click “Add Tags”, here you can add some tags to your instance for management and administrative purposes, again, this is out of scope for this tutorial
  10. Click “Configure Security Group”, from here you can configure the security policies around your EC2
  11. You will see a single rule configured for port 22 (SSH) connections, which allows SSH from any source IP. I would advise changing the source: you can either choose “My IP”, which will detect and use your current IP, or you may wish to add further rules for multiple IPs/subnets
  12. I am making the assumption you want to allow HTTP traffic to reach your EC2 – click “Add Rule” in the bottom left, and select HTTP (this will allow inbound connections on port 80). If you want to allow HTTPS traffic (let’s be honest, all traffic should be HTTPS, it’s 2018) then you’ll need to add that rule too, as it allows traffic on port 443
  13. Continue adding rules as appropriate to allow connections to your instance
  14. Now you can “Review and Launch” – confirm your details and hit “launch”
  15. When you hit launch you will be prompted to either create a new key pair, or to use an existing one. Create a new one (which you will need to do unless you’ve set one up with AWS previously) and download the .pem file somewhere safe – you’ll need it to connect to the instance
  16. Wait for your instance to be launched

Okay, so you now have an instance, and you’re going to want to do some stuff with it, presumably. All we’re going to do is shell into the server, and install Apache; then we’re going to point a DNS record so that web traffic hits that EC2.

  1. Check the box next to your corresponding EC2 and you’ll get some details in the bottom panel
  2. Click connect in the top bar and you’ll see some details, something like
    ssh -i "the-name-of-your.pem" ec2-user@ec2-xx-xx-xxx-xxx.eu-west-2.compute.amazonaws.com

    Make sure that the .pem path points to the location where you stored the key file from point 15 above. Run the command, and you will be shelled into your server.

  3. You will be prompted to fire a yum update
    sudo yum update
  4. I find the following steps get annoying unless I su to root
    sudo su root
  5. Install your web server
    yum install httpd
  6. Throw in a very simple virtual host declaration, just to accept web traffic (as I say – this is not best practice, it’s just to get you a web-facing EC2!)
    nano /etc/httpd/conf/httpd.conf
  7. At the bottom of this file add something that looks roughly like this
    <VirtualHost *:80>
        ServerName your.domain.or.the.ec2.provided
        DocumentRoot /var/www/html
    </VirtualHost>
  8. Now you have a vhost to accept some web traffic, come out of nano and start httpd
    service httpd start
  9. Make sure the appropriate DNS records are set, pointing either an A record at the instance’s IP address or a CNAME at the AWS-provided subdomain (or hack your hosts file)
  10. Visit the domain name you set in point 7 above

Voila! Very simple, not production ready, but you do now have an EC2 running and accepting web traffic, on the domain/subdomain of your choosing.

Until next time
JTC out