Should I Push or Should I Pull?
<insert_should_i_stay_or_should_i_go_guitar_riff>
You may say “who asked?” - my answer would be “I did”. After attending the DevOpsDays in Amsterdam this year, I was inspired by many talks and open spaces, many revolving around the common theme of GitOps.
GitOps has rapidly gained popularity as an approach for managing infrastructure and deployments. By leveraging Git as the source of truth for declarative infrastructure and application configurations, it provides a streamlined and version-controlled method for operational management.
In this post, I’ll explore the differences between push-based and pull-based deployment models and discuss their pros and cons.
Exploring GitOps
The typical GitOps workflow involves a Git repository that stores the desired state of the infrastructure and applications. The GitOps operator continuously monitors the repository for changes and updates the cluster state accordingly.
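As a rough illustration of that loop, here is a minimal sketch of an Argo CD Application that points the operator at a repository and lets it sync automatically. The application name, repository URL, path, and namespaces are placeholders I made up; only the overall shape of the resource matters here.

```yaml
# Sketch only: name, repoURL, path, and namespaces are hypothetical.
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: my-app
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/example/deployments.git  # repo holding the desired state
    targetRevision: main
    path: apps/my-app
  destination:
    server: https://kubernetes.default.svc               # the cluster Argo CD runs in
    namespace: my-app
  syncPolicy:
    automated:
      prune: true     # delete resources that were removed from Git
      selfHeal: true  # revert manual changes made directly in the cluster
```

With `automated` sync enabled, a commit to the watched path is enough to change the cluster; nobody has to trigger a deployment by hand.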
I wanted to explore a few of these topics further, since my only hands-on experience with GitOps tools came from the Argo CD workshop on the first day of the conference. Hearing how other people manage their IaC gave me some perspective, and I started comparing it with how we do things at my current workplace.
Current workflow
A typical approach in my workplace is to build the Docker image with the dev dependencies, push it to the registry, run tests on the code, build the Docker image without the dev dependencies, push it to the registry, run some more tests, and then trigger the deployment by updating the task definition in ECS or deploying the new Docker Swarm stack with the new image.
I didn’t question this approach - it works. But at some point, I asked - where do I check what version of the image is running in the cluster? The answer was that I have to check the task definition in ECS or inspect the running container in the cluster.
Challenges
In the Terraform code, we have the version of the image that was deployed at some point in the past, but ignore_changes is set on the image tag. The task definition is updated during the deployment, outside of Terraform, so the tag in the code never reflects what is actually running.
What if something fails and the whole EC2 instance with Docker Swarm goes down, or the ECS cluster is deleted by mistake? How do I recover the state of the cluster? I would have to check the Actions tab in the GitHub repository and find the last commit that triggered a deployment. This feels like a lot of manual work and a lot of places to check for the state of the cluster.
This idea of triggering the deployment by finishing the CI pipeline is called the push-based approach.
Push-based approach
graph LR
  subgraph CI/CD
    A[Push to git repo] -->|Triggers| B[Build]
    B -->|Success| C[Test]
    C -->|Success| D[Push to image registry]
    D --> E[Trigger deployment with new image]
  end
The push-based approach offers flexibility as there is no need to run GitOps agents in the Kubernetes cluster. It also provides deployment versatility, allowing you to deploy anywhere as long as the deployment can access the Docker registry. However, it requires manual checks to verify the state of the cluster, making state management cumbersome. In case of failure, recovering the state involves multiple steps and manual interventions, adding complexity to the recovery process.
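To make the push model concrete, here is a hedged sketch of a GitHub Actions workflow for the Docker Swarm case. It is not our actual pipeline: the registry host, image name, stack name, and the IMAGE_TAG variable are assumptions, tests are left out, and registry login and access to the Swarm manager are omitted.

```yaml
# Sketch only: registry, image, and stack names are hypothetical.
name: build-and-deploy
on:
  push:
    branches: [main]

jobs:
  deploy:
    runs-on: ubuntu-latest
    env:
      IMAGE: registry.example.com/image-name
      IMAGE_TAG: ${{ github.sha }}
    steps:
      - uses: actions/checkout@v4

      - name: Build and push the image
        run: |
          docker build -t "$IMAGE:$IMAGE_TAG" .
          docker push "$IMAGE:$IMAGE_TAG"

      # The pipeline itself pushes the new version into the cluster.
      # Assumes docker-compose.yml references the image as ${IMAGE}:${IMAGE_TAG}.
      - name: Deploy the stack with the new image
        run: docker stack deploy -c docker-compose.yml my-stack
```

Notice that the deployed tag exists only in the workflow run and in the cluster. Nothing is written back to the repository, which is exactly why finding the running version later takes detective work.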
Pull-based approach
graph LR
  subgraph CI
    A[Push to git repo] -->|Triggers| B[Build]
    B -->|Success| C[Test]
    C -->|Success| D[Push to image registry]
    D --> E[Set new image in git repo]
  end
  subgraph CD
    F[Check for changes] -->|Changes present| G[Apply changes]
    G -->|State changed| F
    F -->|No changes present| F
  end
  F <-.-> E
On the other hand, the pull-based approach ensures that the Git repository serves as a single source of truth for cluster state and image versions. This method provides easy access to the history of changes and allows for straightforward rollback to previous versions if needed. Despite these advantages, there are potential bandwidth issues if there are frequent deployments and numerous changes in the Git repository. Additionally, this approach requires running GitOps agents within the Kubernetes cluster, adding to the operational overhead.
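One possible shape for the "Set new image in git repo" step is a Kustomize file in which the CI pipeline only bumps the tag. This is a sketch under assumptions: the file layout, image name, and tag are invented for illustration, and Kustomize is just one of several ways to store the version in Git.

```yaml
# kustomization.yaml (sketch; resource file, image name, and tag are hypothetical)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - deployment.yaml
images:
  - name: repository/image-name
    newTag: "5.3.1"   # CI commits a change to this line; the GitOps agent applies it
```

Rolling back then means reverting that commit, and `git log` on this file answers the question of which version is supposed to be running.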
Image constraints
What if I want to configure some constraints on the deployment, like only deploying the image if the version matches ~5.3.x? This is where Argo CD Image Updater comes into play. It allows you to define constraints on the image version and automatically update the deployment when a new image is available.
After you install the necessary manifests, you specify the constraints as annotations in the metadata of the Argo CD Application resource:
metadata:
  annotations:
    argocd-image-updater.argoproj.io/image-list: 'repository/image-name:~x.y.z'
This way, you can ensure that only images matching the specified version pattern are deployed, providing an additional layer of control over your deployments.
By default, Image Updater only changes the image on the Application itself; the change is not recorded in the Git repository. This can be a problem if you want to track changes to the image version.
Today, this tool supports two methods of updating: directly modifying the Application resource or creating a git commit with the updated image version. The latter is the recommended approach as it provides a clear history of changes and allows for easy rollback if needed.
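A sketch of what the Git write-back setup might look like, assuming the annotation-based configuration described in the Image Updater documentation. The alias `app` and the image name are placeholders, and the annotation names are worth checking against the version you run.

```yaml
# Sketch only: the alias "app" and the image name are hypothetical.
metadata:
  annotations:
    # "app" is an alias used to address this image in the other annotations.
    argocd-image-updater.argoproj.io/image-list: 'app=repository/image-name:~5.3'
    # Pick the highest version allowed by the semver constraint above.
    argocd-image-updater.argoproj.io/app.update-strategy: semver
    # Commit the new version to Git instead of patching the Application directly.
    argocd-image-updater.argoproj.io/write-back-method: git
```

With the tilde constraint `~5.3`, versions from 5.3.0 up to (but not including) 5.4.0 qualify, so patch releases are picked up automatically while a 5.4 release waits for a deliberate change.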
Conclusion
The choice between push and pull deployment models depends on your organization’s requirements, infrastructure complexity, and operational preferences. If you value flexibility and ease of deployment, the push-based approach may be suitable for your needs. However, if you prioritize consistency, version control, and automated state management, the pull-based approach offers a more robust solution.
As with any technology decision, there is no one-size-fits-all solution. Assess your infrastructure, team capabilities, and operational requirements to determine the best approach for your deployments.