Data Repositories in PHP

I've always been a bit wary of ORMs. While I understand the appeal of modeling a data structure in the application and the fine-grained control that gives you, it just feels like a heavy-handed solution. It's a lot of extra code to write and maintain, and building it out 'properly', with query-building options and all, gets unwieldy. Instead of modeling out a data structure, I prefer wrapping things in repositories.

To be honest, I didn't know what a domain repository was until a few weeks ago; I used to call them collectors in previous application builds. Basically, it is a way to encapsulate the retrieval of data that follows certain business rules. I don't want a controller or view layer knowing about SQL. I just want to be able to say 'give me all the published records' and get a collection back, or 'give me the blog post that matches this URL pattern'.

This seems a lot more useful than a typical ORM. An 'Active Record'-style model would abstract away a certain column name, like 'is_published', into a method, but the application would still need to know which method to call to build the query. Having all that logic tucked away inside a repository keeps the application delightfully naive about which method or column name determines the state of a record. The business logic for data definition is kept in one tidy domain-driven area.
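To make that contrast a bit more concrete, here's a rough sketch of calling code that only knows the question it wants answered. The names `PublishedPostFinder` and `buildArchiveTitles()`, along with the inline stand-in implementation and its sample rows, are all made up for illustration.

```php
<?php

// Hypothetical interface: the only thing the calling code gets to know about.
interface PublishedPostFinder
{
    public function getPublishedPosts();
}

// The caller asks for "published posts" and has no idea whether that means
// is_published = 1, a date check, or something else entirely.
function buildArchiveTitles(PublishedPostFinder $finder)
{
    $titles = [];
    foreach ($finder->getPublishedPosts() as $post) {
        $titles[] = $post['title'];
    }
    return $titles;
}

// Stand-in implementation; the rule for "published" lives here and only here.
$finder = new class implements PublishedPostFinder {
    private $rows = [
        ['title' => 'First post', 'is_published' => 1],
        ['title' => 'Draft post', 'is_published' => 0],
    ];

    public function getPublishedPosts()
    {
        return array_values(array_filter($this->rows, function ($row) {
            return $row['is_published'] === 1;
        }));
    }
};

$titles = buildArchiveTitles($finder);
```

If the definition of 'published' ever changes, only the implementation needs to move; `buildArchiveTitles()` stays exactly as it is.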

One of the big benefits of domain repositories is the ability to switch out the data source via interfaces. You can define a standard set of methods that are useful to have for the type of data and then enforce them with an interface. Then, each storage engine builds off of that template. This makes sense, as the business logic surrounding the possible states of data can change based on the storage. Here's an example interface/object that I'm working on with this blog.

    // first the interface
    interface PostRepositoryInterface
    {
        public function getActivePosts($limit = null, $offset = 0);
        public function getActivePostsCount();
    }

    // then the storage-specific implementation
    class MysqlPostRepository implements PostRepositoryInterface
    {
        // connection wrapper; assumed to expose fetchAll() and fetchValue()
        protected $db;

        public function __construct($db)
        {
            $this->db = $db;
        }

        public function getActivePosts($limit = null, $offset = 0)
        {
            $query = "
                SELECT `id`, `title`, `path`, `date`, `body`, `category`
                FROM `jpemeric_blog`.`post`
                WHERE `display` = :is_active
                ORDER BY `date` DESC";
            // strict check, so the limit is only skipped when none was passed
            if ($limit !== null) {
                // cast to integers, since these are interpolated rather than bound
                $query .= "
                LIMIT " . (int) $offset . ", " . (int) $limit;
            }
            $bindings = [
                'is_active' => 1,
            ];
            return $this->db->fetchAll($query, $bindings);
        }

        public function getActivePostsCount()
        {
            $query = "
                SELECT COUNT(1) AS `count`
                FROM `jpemeric_blog`.`post`
                WHERE `display` = :is_active";
            $bindings = [
                'is_active' => 1,
            ];
            return $this->db->fetchValue($query, $bindings);
        }
    }

This is a fairly basic interpretation of the repository pattern; fancier things like gateways or entity mappers can add some more flair. It allows me to define certain types of data collections, enforce them across storage options, and then deal with boring flat arrays when everything comes back. And I don't have to rewrite the same query, or an obfuscated query-building structure, multiple times across my application.
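As a rough sketch of what a second storage option could look like, here's the same interface backed by a plain PHP array. `ArrayPostRepository`, the sample rows, and the 'Y-m-d H:i:s' date strings are assumptions for illustration, not something pulled from this blog's codebase.

```php
<?php

// Repeated from the listing above so this sketch runs on its own.
interface PostRepositoryInterface
{
    public function getActivePosts($limit = null, $offset = 0);
    public function getActivePostsCount();
}

// Hypothetical second storage option: posts held in a plain PHP array.
class ArrayPostRepository implements PostRepositoryInterface
{
    private $posts;

    public function __construct(array $posts)
    {
        $this->posts = $posts;
    }

    public function getActivePosts($limit = null, $offset = 0)
    {
        $active = $this->filterActive();

        // Newest first, mirroring ORDER BY `date` DESC in the MySQL version.
        usort($active, function ($a, $b) {
            return strcmp($b['date'], $a['date']);
        });

        // Like the MySQL version, the offset only applies when a limit is given.
        if ($limit === null) {
            return $active;
        }
        return array_slice($active, (int) $offset, (int) $limit);
    }

    public function getActivePostsCount()
    {
        return count($this->filterActive());
    }

    // The "active" rule for this storage: a truthy `display` flag.
    private function filterActive()
    {
        return array_values(array_filter($this->posts, function ($post) {
            return !empty($post['display']);
        }));
    }
}

// Sample rows, invented for the example.
$repository = new ArrayPostRepository([
    ['id' => 1, 'title' => 'Oldest', 'date' => '2015-01-01 00:00:00', 'display' => 1],
    ['id' => 2, 'title' => 'Hidden', 'date' => '2015-02-01 00:00:00', 'display' => 0],
    ['id' => 3, 'title' => 'Newest', 'date' => '2015-03-01 00:00:00', 'display' => 1],
]);

$latest = $repository->getActivePosts(1);
$total = $repository->getActivePostsCount();
```

Anything type hinted against `PostRepositoryInterface` can take either implementation, and what comes back is still a boring flat array.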

There are two things that I'm still a bit iffy on with repositories. The first is storage-specific instantiation. While I like the idea of type hinting by interface, there will be a spot in the application that literally calls 'MysqlPostRepository', which just feels dirty. I'm not sure yet if it's worth an additional layer to abstract that out. The other is that each 'type' of repository will only have one option to pull from. It'd be nice to say 'give me all the published records' and have something reach out to a cache layer and, on failure, fall back to a database, but again this could be solved with an additional layer. Eh, a future enhancement to explore.
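For what it's worth, that 'additional layer' could be fairly small. Here's a minimal sketch, assuming a repository signals a miss by returning null. `FallbackPostRepository`, that null convention, and both stand-in storages are hypothetical and not something I've built.

```php
<?php

// Repeated from the earlier listing so this sketch runs on its own.
interface PostRepositoryInterface
{
    public function getActivePosts($limit = null, $offset = 0);
    public function getActivePostsCount();
}

// Hypothetical layer: ask each repository in order and return the first
// answer that is not null (null is the assumed signal for a miss).
class FallbackPostRepository implements PostRepositoryInterface
{
    private $repositories;

    public function __construct(array $repositories)
    {
        $this->repositories = $repositories;
    }

    public function getActivePosts($limit = null, $offset = 0)
    {
        foreach ($this->repositories as $repository) {
            $posts = $repository->getActivePosts($limit, $offset);
            if ($posts !== null) {
                return $posts;
            }
        }
        return [];
    }

    public function getActivePostsCount()
    {
        foreach ($this->repositories as $repository) {
            $count = $repository->getActivePostsCount();
            if ($count !== null) {
                return $count;
            }
        }
        return 0;
    }
}

// Stand-ins for illustration only: a cache that always misses...
$emptyCache = new class implements PostRepositoryInterface {
    public function getActivePosts($limit = null, $offset = 0)
    {
        return null;
    }

    public function getActivePostsCount()
    {
        return null;
    }
};

// ...and a "database" that always has one row.
$database = new class implements PostRepositoryInterface {
    public function getActivePosts($limit = null, $offset = 0)
    {
        return [['id' => 1, 'title' => 'From the database']];
    }

    public function getActivePostsCount()
    {
        return 1;
    }
};

// The one spot that knows which storage options exist and in what order.
$repository = new FallbackPostRepository([$emptyCache, $database]);
$posts = $repository->getActivePosts(10);
$count = $repository->getActivePostsCount();
```

This keeps the storage-specific wiring in a single place, though it doesn't write anything back to the cache after a miss. That part would still need some thought.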

One side benefit that I like about these is the testing aspect. The way the class is set up, it is fairly easy to create a data structure that follows certain patterns and then throw it at these repositories to see what comes out. I'll be covering how to test domain repositories in the next blog post.