I'm a DevOps Architect for our on-premises multi-cloud solution. The core business of the company I work for is clearing between all the banks in my country.
We're now in the implementation phase of major projects for Instant Payments and an Open Banking API HUB (PSD2). One of our requirements for solution providers was that their solutions be "Kubernetes agnostic," so that we can deploy them on OpenShift, vanilla Kubernetes, etc.
At the same time, we strongly prefer that all newly developed in-house applications can be deployed on Kubernetes as well.
Since we're a small company, we need to be efficient, which is why we are implementing automated releases, automated testing, and CI/CD pipelines for our microservices.
When I started using OpenShift, it proved to be a very resilient platform with hardened security. My responsibility was to build up the whole on-premises private multi-cloud. We adopted a GitOps practice, so the source code of all the applications, together with the Kubernetes manifests and the database scripts, is stored in our private GitHub repository, which serves as the single source of truth. At the same time, we realized that we cannot store sensitive data in GitHub, such as credentials to other systems, certificates, API keys, etc.
To solve this, we deployed and configured a dedicated system for the secure storage of sensitive data: HashiCorp Vault.
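As an illustration of how an application can consume Vault secrets without anything sensitive ever landing in Git, here is a minimal sketch using the Vault Agent Sidecar Injector; the application name, Vault role, and secret path are placeholders, not our actual configuration:
apiVersion: v1
kind: Pod
metadata:
  name: payments-api                                   # hypothetical application
  annotations:
    vault.hashicorp.com/agent-inject: "true"           # ask the Vault Agent Injector webhook to add a sidecar
    vault.hashicorp.com/role: "payments-api"           # assumed Vault Kubernetes-auth role
    vault.hashicorp.com/agent-inject-secret-db-creds: "secret/data/payments/db"   # assumed KV path; rendered to /vault/secrets/db-creds
spec:
  serviceAccountName: payments-api                     # service account that the Vault role trusts
  containers:
    - name: payments-api
      image: registry.example.com/payments-api:1.0.0   # placeholder image; reads its credentials from /vault/secrets
The application only reads files under /vault/secrets at runtime, so the repository never contains the secret values.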
Since we have two data centers, we do our multi-cloud deployments without stretching a single OpenShift cluster across them: each data center has its own OpenShift cluster, and both clusters are connected to the same database and the same external persistent storage. The production databases are not deployed on OpenShift.
We make heavy use of cloud-native CI/CD pipelines, built on two cloud-native projects supported by Red Hat: Tekton for continuous integration and ArgoCD for continuous deployment.
Red Hat developed operators for these two tools, and we have embedded them in our day-to-day work. Whenever a developer performs a specific action on GitHub, such as opening a pull request, separate processes start automatically through webhooks. The source code is then built, tested, packaged into a container, pushed to our private Docker registry, and deployed to the appropriate environment depending on the GitHub branch (dev, test, prod).
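To give a feel for what such a pipeline can look like, here is a minimal Tekton sketch, assuming the git-clone and buildah tasks shipped with the OpenShift Pipelines catalog; the pipeline name, parameters, and image reference are placeholders, and a real pipeline would add test and deployment steps:
apiVersion: tekton.dev/v1beta1
kind: Pipeline
metadata:
  name: api-build-and-push              # hypothetical pipeline name
spec:
  params:
    - name: git-url                     # repository to build, passed in by the trigger
    - name: git-revision                # branch or commit that the webhook reported
    - name: image                       # target image in the private registry, including a tag
  workspaces:
    - name: source                      # shared workspace holding the checked-out code
  tasks:
    - name: clone
      taskRef:
        name: git-clone                 # catalog task that clones the repository
        kind: ClusterTask
      params:
        - name: url
          value: $(params.git-url)
        - name: revision
          value: $(params.git-revision)
      workspaces:
        - name: output
          workspace: source
    - name: build-and-push
      runAfter: [clone]
      taskRef:
        name: buildah                   # catalog task that builds the container image and pushes it
        kind: ClusterTask
      params:
        - name: IMAGE
          value: $(params.image)
      workspaces:
        - name: source
          workspace: source
Typically, a Tekton EventListener receives the GitHub webhook and creates a PipelineRun with the branch-specific parameters.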
We created generic CI/CD pipelines for APIs and databases, and by using them we have sped up day-to-day deployments by eliminating manual steps such as copying files and assigning privileges across the various environments. Now the rest of the staff has more time for concrete work rather than repetitive tasks that are also very prone to error. We're much more efficient now.
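For the database pipelines, the same pattern applies; as a purely hypothetical sketch assuming Flyway as the migration tool (the task name, JDBC URL, script location, and Secret name are placeholders), a Tekton task that applies versioned SQL scripts could look like this:
apiVersion: tekton.dev/v1beta1
kind: Task
metadata:
  name: apply-db-migrations             # hypothetical task name
spec:
  params:
    - name: jdbc-url                    # placeholder, e.g. the test or production database URL
  workspaces:
    - name: source                      # repository checkout containing the SQL scripts
  steps:
    - name: migrate
      image: flyway/flyway:9            # assumes Flyway; any migration tool could sit here
      workingDir: $(workspaces.source.path)
      env:
        - name: FLYWAY_USER             # credentials come from a Secret (sourced from Vault), never from Git
          valueFrom:
            secretKeyRef:
              name: db-credentials
              key: username
        - name: FLYWAY_PASSWORD
          valueFrom:
            secretKeyRef:
              name: db-credentials
              key: password
      args:
        - "-url=$(params.jdbc-url)"
        - "-locations=filesystem:./db/migrations"
        - "migrate"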
Our developers aren't using CodeReady Workspaces, and they don't have isolated environments on their laptops, but they do have access to the OpenShift clusters in a very controlled manner, since they deploy the solution to the OpenShift platform through these pipelines.
We're currently using version 4.10. I have also used versions 3.10 and 3.11. We are testing deployment on OpenShift 4.12 as well, where the Kubernetes version is 1.25+, which brought a lot of major improvements.
When there are a handful of APIs and microservices, you can orchestrate them manually or with Docker Swarm. However, when there are dozens of applications consisting of hundreds of microservices, you need a dedicated orchestration platform. That's where Kubernetes comes into play.
When I came to this company, all of the deployments to the various environments (dev, test, prod) were done manually.
As I said, there are two main types of deployments. One type is the deployment of database changes: new tables, new procedures, new functions, etc.
The other type is the deployment of APIs. For those, my colleagues built the projects manually; they did not use any kind of SCM tool at the time. Once they had the compiled artifacts, such as DLL files, they handed them over to another team, which manually copied them to our web servers.
For databases, it was practically the same. Developers wrote the SQL scripts, another team deployed and tested them in a test environment, and after successful testing a third team deployed the scripts to production.
This was a very manual, lengthy, and error-prone process where a lot of things could go wrong.
We now use our private GitHub repository for software development and version control, as well as CI/CD pipelines for both API and database deployments to each environment.
It's a fully automated and predictable process.
Deployment of a new release to the test or production environment is protected, and developers can't deploy a new version any time they want. They need to have approval from the release manager. If the release manager approves the deployment to a specific environment, the pipeline automatically picks up that information and deploys the whole solution.
This control over deployments to a specific environment (the four-eyes principle) is enforced with policies on GitHub.
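To show how an approved merge turns into a deployment, here is a minimal sketch of an ArgoCD Application that tracks an environment branch; the repository URL, path, branch, and namespaces are placeholders rather than our actual setup:
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: payments-api-prod               # hypothetical application name
  namespace: openshift-gitops           # namespace used by the OpenShift GitOps operator
spec:
  project: default
  source:
    repoURL: https://github.com/example-org/payments-api.git   # placeholder repository
    targetRevision: prod                # environment branch; only approved merges land here
    path: k8s/overlays/prod             # placeholder path to the environment manifests
  destination:
    server: https://kubernetes.default.svc
    namespace: payments-prod            # placeholder target namespace
  syncPolicy:
    automated:
      prune: true                       # delete resources that were removed from Git
      selfHeal: true                    # revert manual drift back to the state in Git
Because the Application watches the prod branch, nothing reaches production until the release manager's approval allows a change to be merged into that branch.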
We have reports in the form of audit logs that show who did what, at what time, and in which environment, and we have total control over the production environments.
We have achieved zero downtime on almost all levels because of the way OpenShift and containers work. When a new container is deployed, the old one keeps running; once the new one starts and all of its readiness probes pass, the old container is terminated. Most of the time we have no downtime at all, apart from some work that still has to be done manually.
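That rolling behavior is standard Kubernetes; a minimal sketch of a Deployment configured for it (the names, image, and health endpoint are placeholders) looks like this:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: payments-api                    # hypothetical application
spec:
  replicas: 2
  selector:
    matchLabels:
      app: payments-api
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 0                 # never remove a running pod before its replacement is ready
      maxSurge: 1                       # start one extra pod with the new version alongside the old ones
  template:
    metadata:
      labels:
        app: payments-api
    spec:
      containers:
        - name: payments-api
          image: registry.example.com/payments-api:1.0.1   # placeholder image for the new release
          ports:
            - containerPort: 8080
          readinessProbe:
            httpGet:
              path: /healthz            # placeholder health endpoint
              port: 8080
            initialDelaySeconds: 5
            periodSeconds: 10
With maxUnavailable set to 0, old pods are terminated only after the new ones report ready, which is what gives the zero-downtime rollouts described above.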
Using GitHub and CI/CD pipelines, we have eliminated a lot of manual work and sped up the deployment of new releases to the various environments by at least 30%. The process is now more reliable because it is automated, and we avoid human error along the way.
OpenShift also comes with integrated monitoring: internal metrics are collected by Prometheus and visualized in Grafana, but you can export all the logs and metrics from OpenShift to another system. For example, you can ship them to the Elastic Stack using Filebeat, Metricbeat, and Auditbeat, so you can monitor the cluster from the outside, because you need to be on top of it all the time.
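As an illustration of that kind of export, a minimal Filebeat sketch for shipping container logs might look like the following; the Elasticsearch endpoint and credentials are placeholders, and in practice the credentials would come from a Secret rather than from Git:
filebeat.inputs:
  - type: container                     # read container logs from the node
    paths:
      - /var/log/containers/*.log
    processors:
      - add_kubernetes_metadata:        # enrich each event with pod, namespace, and label metadata
          host: ${NODE_NAME}
          matchers:
            - logs_path:
                logs_path: "/var/log/containers/"
output.elasticsearch:
  hosts: ["https://elastic.example.com:9200"]   # placeholder Elasticsearch endpoint
  username: "${ELASTIC_USER}"                   # injected from a Secret, not stored in Git
  password: "${ELASTIC_PASSWORD}"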
When something goes down, you must intervene quickly, so you need to get the information immediately through different channels like Slack, SMS, email, Teams, or chat.
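OpenShift's monitoring stack already includes Alertmanager for exactly this kind of notification routing; a minimal sketch (the webhook URL, channel, SMTP relay, and addresses are placeholders) could look like this:
global:
  smtp_smarthost: 'smtp.example.com:587'        # placeholder SMTP relay for email notifications
  smtp_from: 'alerts@example.com'               # placeholder sender address
route:
  receiver: ops-team                            # send everything to the on-call receiver by default
  group_by: ['alertname', 'namespace']          # collapse related alerts into one notification
receivers:
  - name: ops-team
    slack_configs:
      - api_url: 'https://hooks.slack.com/services/XXX/YYY/ZZZ'   # placeholder Slack incoming-webhook URL
        channel: '#ops-alerts'                                    # placeholder channel
        send_resolved: true                                       # also notify when the alert clears
    email_configs:
      - to: 'ops@example.com'                                     # placeholder distribution list
        send_resolved: true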