In this post, we look at how to use CakePHP’s Queue Plugin to efficiently manage and execute background tasks. We’ll explore the structure, implementation, and benefits of this approach, with a focus on the src/Job directory and related components of Willow CMS.
Overview of CakePHP Queues and Jobs
CakePHP provides a robust Queue plugin that allows developers to offload time-consuming or resource-intensive tasks to background processes. This is particularly useful for operations that don’t need to be executed immediately or that might slow down the user experience if performed synchronously.
The Queue Plugin extends CakePHP’s capabilities by providing a flexible and efficient way to manage these background jobs.
Willow CMS Job Structure
Let’s examine how Willow CMS organises its job-related code:
- All job classes are located in the src/Job directory
- Each job is defined as a separate class that implements the JobInterface provided by the Queue Plugin
- Jobs are typically named with a descriptive suffix, such as UpdateJob, AnalysisJob, or ProcessJob
Here’s the directory structure:
src/
└── Job/
├── ArticleSeoUpdateJob.php
├── ArticleSummaryUpdateJob.php
├── ArticleTagUpdateJob.php
├── CommentAnalysisJob.php
├── ImageAnalysisJob.php
├── ProcessImageJob.php
├── SendEmailJob.php
├── TagSeoUpdateJob.php
├── TranslateArticleJob.php
├── TranslateI18nJob.php
└── TranslateTagJob.php
This makes it easy to locate and manage different types of background tasks within the application.
Queuing Jobs in Willow CMS
Willow CMS uses the Queue Plugin to enqueue jobs for later execution. This is done by putting messages onto a queue. The message can contain data that the Job will need to do its thing. Let’s look at how this is typically done:
In src/Model/Table/ArticlesTable.php, we can find examples of queuing jobs:
$this->queueJob('App\Job\ArticleTagUpdateJob', $data);
This line of code queues an ArticleTagUpdateJob with the provided $data. The queueJob method is a wrapper around the Queue Plugin’s job creation functionality, simplifying the process of adding jobs to the queue. In future I will refactor this since we use the same method in the TagsTable too.
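One way that refactor could look is a small trait shared by both table classes. The sketch below is only an illustration, not Willow CMS code: the trait name QueueableJobsTrait, the setJobPusher() test seam, the FakeArticlesTable stand-in and the 'default' config name are all assumptions. The one real plugin call is QueueManager::push(), which takes the job class, the message data and an options array.

```php
<?php
declare(strict_types=1);

use Cake\Queue\QueueManager;

/**
 * Hypothetical trait (not part of Willow CMS): one queueJob() that
 * ArticlesTable, TagsTable and other classes could share.
 */
trait QueueableJobsTrait
{
    /** @var callable|null Optional replacement for the real enqueue call, handy in tests. */
    private $jobPusher = null;

    public function setJobPusher(callable $pusher): void
    {
        $this->jobPusher = $pusher;
    }

    public function queueJob(string $jobClass, array $data, array $options = []): void
    {
        // 'config' selects a queue configuration; 'default' is an assumption here.
        $options += ['config' => 'default'];

        if ($this->jobPusher !== null) {
            ($this->jobPusher)($jobClass, $data, $options);

            return;
        }

        // Real enqueue through the cakephp/queue plugin.
        QueueManager::push($jobClass, $data, $options);
    }
}

// Stand-in for a table class; a real one would extend Cake\ORM\Table.
final class FakeArticlesTable
{
    use QueueableJobsTrait;
}

// Usage: record pushes in memory instead of talking to a real queue.
$pushed = [];
$table = new FakeArticlesTable();
$table->setJobPusher(function (string $class, array $data, array $options) use (&$pushed): void {
    $pushed[] = ['class' => $class, 'data' => $data, 'options' => $options];
});
$table->queueJob('App\Job\ArticleTagUpdateJob', ['id' => 'abc-123', 'title' => 'Hello']);
```

Keeping the enqueue call behind one method means options such as the queue config only need to change in one place, and the injected pusher lets a test check what was queued without a running queue backend.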
Job Types and Their Purposes
Let’s explore some of the job types in Willow CMS and their purposes:
- ArticleSeoUpdateJob: Updates SEO metadata for articles using the Anthropic API service. It retrieves the article, generates SEO content (such as meta descriptions and keywords) using AI, and updates the article with the new SEO metadata. The job also handles error logging and cache clearing.
- ArticleSummaryUpdateJob: Generates and updates article summaries using the TextSummaryGenerator, which uses the Anthropic API service. It retrieves the article, generates a summary and lede (if they don’t exist), and updates the article with the new content. The job also handles error logging and ensures that existing summaries are not overwritten.
- ArticleTagUpdateJob: Updates tags associated with articles using the Anthropic API service. It retrieves the article, generates new tags based on the article content, and updates the article with the new tags. The job handles both root-level and nested child tags, and includes error logging and cache clearing.
- CommentAnalysisJob: Analyzes comments for sentiment and spam detection using the Anthropic API service.
- ImageAnalysisJob: Analyzes images to generate alt text and keyword tags for search, using the Anthropic API service.
- ProcessImageJob: Handles image processing tasks like resizing, format conversion, or optimisation.
- SendEmailJob: Manages the sending of emails, allowing for better handling of large batches of emails or retries on failure.
- TagSeoUpdateJob: Updates SEO information for tags, generating meta descriptions and related SEO content using the Anthropic API service.
- TranslateArticleJob, TranslateI18nJob, TranslateTagJob: Handle translation tasks for articles, internationalisation strings, and tags respectively, using the Google Cloud Translate API.
Let’s take a closer look at the ArticleTagUpdateJob, which is responsible for updating the tags of an article using the Anthropic API:
declare(strict_types=1);

namespace App\Job;

use App\Service\Api\Anthropic\AnthropicApiService;
use Cake\Cache\Cache;
use Cake\Log\LogTrait;
use Cake\ORM\Entity;
use Cake\ORM\Table;
use Cake\ORM\TableRegistry;
use Cake\Queue\Job\JobInterface;
use Cake\Queue\Job\Message;
use Interop\Queue\Processor;

class ArticleTagUpdateJob implements JobInterface
{
    use LogTrait;

    public static int $maxAttempts = 3;

    public static bool $shouldBeUnique = true;

    private AnthropicApiService $anthropicService;

    public function __construct(?AnthropicApiService $anthropicService = null)
    {
        $this->anthropicService = $anthropicService ?? new AnthropicApiService();
    }

    // Jobs must have an execute method which is the default entry point for the task
    public function execute(Message $message): ?string
    {
        // The message will contain the data we set when queuing the message, so get it
        $id = $message->getArgument('id');
        $title = $message->getArgument('title');

        // Logging is always useful and will show up in the Admin Logs area...
        $this->log(
            sprintf('Received article tag update message: %s : %s', $id, $title),
            'info',
            ['group_name' => 'App\Job\ArticleTagUpdateJob']
        );

        // You can load models to work with data from a Job
        $articlesTable = TableRegistry::getTableLocator()->get('Articles');
        $tagsTable = TableRegistry::getTableLocator()->get('Tags');

        // And use models just like you would anywhere else in a CakePHP application
        $article = $articlesTable->get(
            $id,
            fields: ['id', 'title', 'body'],
            contain: ['Tags' => ['fields' => ['id']]]
        );
        $allTags = $tagsTable->getSimpleThreadedArray();

        try {
            // Making use of the anthropic service class
            $tagResult = $this->anthropicService->generateArticleTags(
                $allTags,
                (string)$article->title,
                (string)strip_tags($article->body),
            );

            if (isset($tagResult['tags']) && is_array($tagResult['tags'])) {
                $newTags = $this->processAndSaveTags($tagsTable, $tagResult['tags']);
                $article->tags = $newTags;

                if ($articlesTable->save($article, ['validate' => false, 'noMessage' => true])) {
                    $this->log(
                        sprintf('Article tag update completed successfully. Article ID: %s', $id),
                        'info',
                        ['group_name' => 'App\Job\ArticleTagUpdateJob']
                    );
                    Cache::clear('articles');

                    return Processor::ACK;
                }
            }
        // Leading backslash: a bare Exception would resolve to App\Job\Exception here and never match
        } catch (\Exception $e) {
            $this->log(
                sprintf(
                    'Article Tag update failed. ID: %s Title: %s Error: %s',
                    $id,
                    $title,
                    $e->getMessage(),
                ),
                'error',
                ['group_name' => 'App\Job\ArticleTagUpdateJob']
            );
        }

        // Something went wrong if we got here so tell the queue plugin to reject this run of the job
        return Processor::REJECT;
    }

    // Other helper methods... see the full source on GitHub
}
This job demonstrates several aspects of Willow CMS’s jobs setup:
- API Integration: It uses an external API (Anthropic) to generate tags for articles, showcasing how background jobs can integrate with third-party services.
- Error Handling: The job wraps the API call in a try/catch block and logs any failure, while the Queue Plugin offers mechanisms for managing job failures and retries.
- Logging: Extensive logging is implemented throughout the job, providing valuable insights for monitoring and debugging.
- Return Values: The execute method returns Processor::ACK on success and Processor::REJECT on failure. Both remove the message from the queue; the plugin retries a job only when it returns Processor::REQUEUE or throws an uncaught exception.
- Database Operations: It interacts with multiple database tables (Articles and Tags) to fetch and update data.
- Job Configuration: The class defines properties like $maxAttempts and $shouldBeUnique to control job execution behavior.
- Dependency Injection: The job uses constructor injection to receive necessary dependencies, such as the AnthropicApiService. This makes it easy to switch to another AI service from Google, OpenAI etc.
- Caching: After successful tag updates, the job clears the articles cache to ensure fresh data is served to users.
- Single Responsibility: The job class is responsible for the single task of maintaining tags related to an article, making the code more maintainable and easier to understand.
- Separation of Concerns: By moving this relatively time-consuming task into a background job, the save article action remains clean and responsive.
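To make the return values and $maxAttempts easier to reason about, here is a toy model of a worker loop in plain PHP. It is a simplified sketch and not the Queue Plugin’s implementation: the ACK, REJECT and REQUEUE constants are local stand-ins for the Interop\Queue\Processor constants, and runWithRetries() is a hypothetical helper. It assumes that a null result counts as success and that an uncaught exception leads to a requeue until the attempt limit is reached.

```php
<?php
declare(strict_types=1);

// Local stand-ins for Interop\Queue\Processor::ACK / REJECT / REQUEUE.
const ACK = 'ack';
const REJECT = 'reject';
const REQUEUE = 'requeue';

/**
 * Toy worker: runs $job until it is acknowledged, rejected, or out of attempts.
 *
 * @return array{0: string, 1: int} Final outcome and number of attempts used.
 */
function runWithRetries(callable $job, int $maxAttempts): array
{
    $attempts = 0;
    do {
        $attempts++;
        try {
            // A job that returns null is treated as a success.
            $result = $job($attempts) ?? ACK;
        } catch (Throwable $e) {
            // An uncaught exception means "try again later".
            $result = REQUEUE;
        }
    } while ($result === REQUEUE && $attempts < $maxAttempts);

    // Running out of attempts turns the last requeue into a final rejection.
    return [$result === REQUEUE ? REJECT : $result, $attempts];
}

// Usage: a job that fails twice (say, an API timeout) and then succeeds.
$flaky = function (int $attempt): ?string {
    if ($attempt < 3) {
        throw new RuntimeException('temporary API failure');
    }

    return ACK;
};
[$outcome, $attempts] = runWithRetries($flaky, 3);
```

In this model a job that catches its own exceptions and returns REJECT is never retried, which is worth keeping in mind when deciding whether a failure should be rejected or requeued.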
Job Configuration and Triggering
Willow CMS uses the Queue Plugin’s configuration options to fine-tune job execution. Here are some key aspects of job configuration and triggering:
- Queue Configuration: The Queue Plugin is configured in the application’s config/app.php file. This configuration includes settings such as the queue engine (e.g., Redis, Database), worker options, and retry strategies.
- Job-Specific Settings: As seen in the ArticleTagUpdateJob example, individual job classes can define specific settings:
  public static int $maxAttempts = 3;
  public static bool $shouldBeUnique = true;
  These settings control the maximum number of retry attempts and ensure that only one instance of the job is in the queue at a time.
- Queueing and Triggering Jobs: Jobs are queued in response to specific events or actions within the application. For example:
  - After creating or updating an article:
    $this->queueJob('App\Job\ArticleTagUpdateJob', ['id' => $article->id]);
  - After uploading an image:
    $this->queueJob('App\Job\ProcessImageJob', ['image_id' => $image->id]);
  Jobs are triggered using the Queue Worker. In the development environment the Queue Worker is run manually on the command line (see the ReadMe). In a production environment the Queue Worker is run automatically via Supervisord.
- Job Data: The message put onto the queue can contain data, which the job reads using the getArgument() method on the message. To keep things simple, I only put the bare minimum needed for the job in the message: in this case the ID of the model record the job will work with, and the title so that the logs can include some helpful information if anything goes wrong.
  $id = $message->getArgument('id');
  $title = $message->getArgument('title');
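Returning to the Queue Configuration item at the top of this list, here is a minimal sketch of what the Queue section of config/app.php might contain. The key names follow the cakephp/queue plugin’s documented options as far as I know them. The Redis URL, the queue name and the cache engine are placeholder assumptions rather than Willow CMS’s actual settings, and the array is assigned to a $queueConfig variable here only so the snippet stands alone.

```php
<?php
declare(strict_types=1);

// Sketch of the 'Queue' section of config/app.php (all values are placeholders).
$queueConfig = [
    'Queue' => [
        // 'default' is the config name a push falls back to when none is given.
        'default' => [
            // DSN of the queue backend; a local Redis transport is assumed here.
            'url' => 'redis://127.0.0.1:6379',
            // Name of the queue that jobs are pushed to and workers consume from.
            'queue' => 'default',
            // Keep a record of jobs that ultimately fail so they can be inspected later.
            'storeFailedJobs' => true,
            // Cache used to track unique jobs; relevant when a job sets $shouldBeUnique.
            'uniqueCache' => [
                'engine' => 'File',
            ],
        ],
    ],
];
```

If a job declares $shouldBeUnique but no unique cache is configured, the uniqueness check has nothing to store its state in, so these two settings are best reviewed together.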