Engineering Platforms Haven’t Failed. We Haven’t Fully Committed to Them

It's not that engineering platforms have been tried and found wanting.

It's that they've been found complex, and not truly tried.

We often dismiss DevOps or platform initiatives as "not working", when in reality, they require patience, design maturity, and cultural adoption.

The challenge is not the technology itself; it is our commitment to the hard, systematic work needed to make it succeed.

Glitch or Outage?

Do you find it hard to tell whether that spike on your service was a glitch or an outage?

This is what I think:

  • Less than a minute: Often considered a glitch, especially if it self-corrects quickly and has minimal impact on users.
  • 1-5 minutes: Could be either a glitch or an outage, depending on factors such as service criticality and regulatory requirements. [To be defined]
  • More than 5 minutes: Almost certainly an outage.
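
A minimal sketch of these thresholds as a shell function, taking the duration in seconds. The name classify_incident and the "undetermined" label for the 1-5 minute band are placeholders of mine, not an established convention; adjust the cut-offs to your own service.

```shell
# Classify a disruption by how long it lasted, in seconds.
# Under 60s is a glitch, over 300s is an outage, and the band in
# between is left open because it depends on criticality, regulation, etc.
classify_incident() {
  seconds=$1
  if [ "$seconds" -lt 60 ]; then
    echo "glitch"
  elif [ "$seconds" -le 300 ]; then
    echo "undetermined"
  else
    echo "outage"
  fi
}

classify_incident 45    # under a minute
classify_incident 180   # three minutes
classify_incident 420   # seven minutes
```

A function like this only looks at a single event, so it says nothing about repeated short glitches.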

Now, what about a series of glitches? :P

What is Platform Engineering?

I like to summarise it as follows...

  1. Treat your development teams as clients
  2. Abstract environment and tooling complexity
  3. Enable self-service in a cost-effective, secure manner that is manageable at scale

What I suggest you read...

My suggested reading list:

  • Accelerate: A must-read on building high-performing teams.
  • Measure What Matters: The core of OKRs and how to implement them effectively.
  • The Lean Startup: A guide to building the right product.
  • Software Architecture: The Hard Parts: A solid reference for tackling architectural challenges.

Shariq Mustaquim’s books on Goodreads:

https://www.goodreads.com/review/list/128132142-shariq-mustaquim?shelf=%23ALL%23

Agile Workflow: Split Known Bottlenecks into “Ready for” and “Doing” Columns

What:

Separate the stages where work waits from the stages where it actively progresses (e.g., "Ready for Development" and "Development").

Why:

Improves WIP limit management (e.g., limit "Development" to 3 tasks, but allow unlimited "Ready for Development").

Reveals wait-time bottlenecks in reports.

How to Implement:

Edit your workflow → Add statuses like:

  • ✅ Ready for Review
  • 🔄 In Review

Set WIP limits in Jira’s board settings.

Example Workflow:

Backlog → Ready for Dev → Development → Ready for QA → Testing → Done

Common Pitfall:

Avoid over-splitting (e.g., "Ready for Dev," "Almost Ready for Dev"). Stick to one "Ready" queue per stage.

Blocked status or Flag item

I have seen many teams introduce a "Blocked" status in their workflow, which appears as a column on the board to highlight that something is blocked. I think there is a better alternative: use the Jira flag.

What:

Highlight externally blocked items (e.g., waiting on a vendor) using Jira’s built-in flag.

Avoids creating a dedicated "Blocked" column in your workflow.

Why:

Preserves workflow clarity by showing where the blockage occurred (e.g., "Development" vs. "Testing").

Compatible with metrics tools like ActionableAgile.

How to Implement:

Right-click an issue → Select Flag (or use the Alt + S shortcut).

Add a comment explaining the external blocker.

Pro Tip:

No flag needed for internal blockers (e.g., team dependencies). These will naturally reflect in cycle time metrics.

Using helm-secrets

helm-secrets is a handy plugin that keeps secrets out of your source code.

Here, I am using HashiCorp Vault to store secrets and retrieve them safely in Helm values files while installing Helm charts.

Installation

$ helm plugin install https://github.com/jkroepke/helm-secrets

Setup

  $ export VAULT_TOKEN="s.VAULT_TOKENEXAMPLEASLDKASKDASDA" 
  $ export VAULT_ADDR="https://vault.example.com" 
  $ export HELM_SECRETS_DRIVER=vault
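
Since the plugin depends on all three variables, a small pre-flight check can save a confusing failure later. This is only a sketch: check_helm_secrets_env is a hypothetical helper name, and it checks nothing beyond the variables being non-empty.

```shell
# Report which of the variables exported above are missing or empty.
# Prints one line per missing name to stderr and returns non-zero if any are unset.
check_helm_secrets_env() {
  missing=0
  for name in VAULT_TOKEN VAULT_ADDR HELM_SECRETS_DRIVER; do
    # Indirect lookup of the variable called $name (names are fixed above).
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Possible usage before a deployment:
#   check_helm_secrets_env || exit 1
```

The check does not contact Vault, so an expired or wrong token will still only show up when the plugin runs.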

In vault, add the secrets:

In your helm values file, refer to the secret as follows:

db:
  database:      !vault secret/misp#db_database
  username:      !vault secret/misp#db_username
  password:      !vault secret/misp#db_password
  rootpassword:  !vault secret/misp#db_rootpassword

Now change the helm upgrade command as follows:

$ helm secrets upgrade misp ./helm/misp --install --wait --atomic  --namespace=misp --create-namespace  --values=./helm/misp/values.yaml

The plugin resolves the Vault references in the values file before it invokes helm upgrade.

Note:

To check the result of decoding, you can use:

$ helm secrets dec helm/misp/values.yaml

This writes values.yaml.dec, containing the actual decoded values from HashiCorp Vault.

How to update Vault with an ADCS-issued Intermediate Certificate Authority

Start Vault server

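# Dev mode runs in the foreground; run the remaining commands in a second terminal.
vault server -dev
export VAULT_ADDR='https://vault.example.com'
export VAULT_TOKEN="s.VAULT_TOKENEXAMPLEASLDKASKDASDA"
# Skips TLS certificate verification; acceptable for a test setup only.
export VAULT_SKIP_VERIFY=1
vault status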

Enable Engine

vault secrets enable -path=pki_intermediate_ca_core pki
vault secrets tune -max-lease-ttl=87600h pki_intermediate_ca_core # 10 Years
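
The 87600h value is simply 10 years of 365 days expressed in hours. A minimal sketch of that arithmetic, using a hypothetical helper name years_to_hours:

```shell
# Convert whole years to an hour-based duration string,
# assuming 365-day years (leap days are ignored).
years_to_hours() {
  echo "$(( $1 * 365 * 24 ))h"
}

years_to_hours 10   # prints 87600h
```

Counting a year as 365.25 days gives slightly larger numbers, for example 43830h for 5 years.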

Generate CSR

vault write pki_intermediate_ca_core/intermediate/generate/internal common_name="Example Company" ttl=87600h country="United Arab Emirates" locality="Dubai" organization="Example Company" ou="Technology Department"

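-OR- (save the CSR straight to a file; run only one of the two commands, because each call generates a new private key)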

vault write -format=json pki_intermediate_ca_core/intermediate/generate/internal common_name="Example Company" ttl=87600h country="United Arab Emirates" locality="Dubai" organization="Example Company" ou="Technology Department" | jq -r '.data.csr' > pki_intermediate.csr

Key    Value
---    -----
csr    -----BEGIN CERTIFICATE REQUEST-----
MIICWTCCAUECAQAwFDESMBAGA1UEAxMJR2FuZCBCYW5rMIIBIjANBgkqhkiG9w0B
AQEFAAOCAQ8AMIIBCgKCAQEA7e1qj67LeZCnDPKa+14YCWp8XtbG4soRs544lIJW
YipBB5eCaiRazfA5kxWUv3fOklP7/pjCkeCNhryjS5DB1GK1EdgZNFpS8odqxXwY
t4CPECGVRzSK4Cce4OKBXFMKRuTuKgWH9i9Nt+eGaxD2gOkGTruuWyTiLUnr6/mx
PyenoHzMqyeUifTv0M651KUztqPJPvSz0SSO4+jpEIrGPNYEIET1Ce/1Opkf0kCq
vtCOFzIcVqzq/bYUjtkBvKgg7kyUG/EXAMPLEKJLyVA2ij3wC5LXD2Z8OMcr
iGeSqmrOKAeAJeOwfnULIhdsXABXouWlQwi+yhS5cS7QAQIDAQABoAAwDQYJKoZI
hvcNAQELBQADggEBAMPAABu9I+ezwm//CjDiIPhhQQQSsgmXPR9SdQMDkM94hGOQ
WkWFL66RDBZp/kC+OwNDC1lj7hPLGzhhZCQY3xtzcCVhRS8C1LZYiKlZ5HyY+9GG
KwBrOsBVNTyiLTDkpuGNhmUfJbIoM2fLbKoTQ7lWjaH+Ryyd7Ud8eB6L5FLXPpQm
QjdnhXqtQ7Z1u8Q66UzR7wXHhKTZn0ZBxS0C2m85pwgVdVQepL8KyGMx6zRAveyJ
wcZ4L+Ni7op7fO6nb78cfnMSE6Ja5X0KgIU0VPbVbwFAACHkNA9fP5DNvfa5DCWq
7RxQqJ7sQflnVulQ4qUnN1Y1seqFl8W36G3V8uM=
-----END CERTIFICATE REQUEST-----

Sign the CSR with ADCS and export the result in Base64-encoded X.509 (PEM) format.

Note: Run the following on the ADCS server:

    > certreq -submit -attrib "CertificateTemplate:SubCA"

-----BEGIN CERTIFICATE-----
MIIE+zCCAuOgAwIBAgITUAAAAARrstY4ahm6yQAAAAAABDANBgkqhkiG9w0BAQsF
ADAcMRowGAYDVQQDExFHYW5kIEJhbmsgUm9vdCBDQTAeFw0yMjA0MDIwMTQ5MTJa
Fw0yMzA0MDIwMTU5MTJaMBQxEjAQBgNVBAMTCUdhbmQgQmFuazCCASIwDQYJKoZI
hvcNAQEBBQADggEPADCCAQoCggEBAO3tao+uy3mQpwzymvteGAlqfF7WxuLKEbOe
OJSCVmIqQQeXgmokWs3wOZMVlL93zpJT+/6YwpHgjYa8o0uQwdRitRHYGTRaUvKH
asV8GLeAjxAhlUc0iuAnHuDigVxTCkbk7ioFh/YvTbfnhmsQ9oDpBk67rlsk4i1J
6+v5sT8np6B8zKsnlIn079DOudSlM7ajyT70s9EkjuPo6RCKxjzWBCBE9Qnv9TqZ
H9JAqr7QjhcyHFas6v22FI7ZAbyoIO5MlBv2qEANd2Eq/8iiS8lQNoo98AuS1w9m
fDjHK4hnkqpqzigHgCXjsH51CyIXbFwAV6LlpUMIvsoUuXEu0AECAwEAAaOCATww
ggE4MB0GA1UdDgQWBBR6lWYqP/8bOMicwdXJFJ7AnqIUuzAfBgNVHSMEGDAWgBQF
vqWonZOoah+8S0tku+yRQYOkyTBQBgNVHR8ESTBHMEWgQ6BBhj9maWxlOi8vLy9X
SU4tMUlCQzRJRjlKVUYvQ2VydEVucm9sbC9HYW5kJTIwQmFuayUyMFJvb3QlMjBD
QS5jcmwwawYIKwYBBQUHAQEEXzBdMFsGCCsGAQUFBzAChk9maWxlOi8vLy9XSU4t
MUlCQzRJRjlKVUYvQ2VydEVucm9sbC9XSU4tMUlCQzRJRjlKVUZfR2FuZCUyMEJh
bmslMjBSb290JTIwQ0EuY3J0MBkGCSsGAQQBgjcUAgQMHgoAUwB1AGIAQwBBMA8G
A1UdEwEB/wQFMAMBAf8wCwYDVR0PBAQDAgGGMA0GCSqGSIb3DQEBCwUAA4ICAQA4
3TviPyTXM6H+G3WCzdNMhcjauoEXAMPLEGI8JfdBsZayeEtw0ZHLbiEWDvylX
CN5FOoKImfcUNDXMzQY9PiokGKo69WtIUx5V+AZwMxDFoW9tkvrtO5AVtHJLlL5l
MDqD92dDAnojHGn8BDjlrVIxvMomMRXi5p6sksSwDijgnpIJtiml+Ss5nyI7JjID
X2x5fvhRP2kqQHisdpCWyz+l8jqj3dCFsECSHkoGJjhkj8sywJK2kK9h5sXMyj0K
VNRLLli1BaWFYk0++GVK72CnzaTBXw389Pv1a+B3yOYzd+QEoprSs7RUajHPbRmF
iepFIISHGdWrtWxH9W+9R4iWWHzQ7fUNAFjtVBo7inTEtlHYH+EFCv3sgnWW+mkr
AVU/dZV+XsLzBhbd0tm21cn3hWaGMujxswGNHvKw2uvo5KL277VKrgDwEWTIMx/L
LkqCEg23eN7n5oefbULFhJVn9RBFvvjdDs2q81mp/LgeXkJVesdR1Fe83TzXRiR9
gBLRw6RRqWWuRibsGhMl16LthQMFRBbucnBwQfLCxKdV9mv+s5nUrTbUBXSQDFcE
Dyk0Z/BmEPtiRWcQzyzYR4TwWLO3ejPexfZz1rAZdfZMKSuYnz0LqXQ6l2Kjs7b3
nbjn1W7s0CSzE4HomHwKRCqlBJUb/XapqilsQ5kTpQ==
-----END CERTIFICATE-----

Set Intermediate CA in Vault

cat From_ADCS_signed_certificate.pem > full_chain.pem
cat ADCS_root.pem >> full_chain.pem
vault write pki_intermediate_ca_core/intermediate/set-signed certificate=@full_chain.pem 
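
Before running set-signed, it can help to confirm that full_chain.pem really holds both certificates. A rough sketch, assuming the file names used above; count_certs is a hypothetical helper, not a Vault or OpenSSL command.

```shell
# Count the PEM certificates contained in a file.
count_certs() {
  grep -c -- '-----BEGIN CERTIFICATE-----' "$1"
}

# For the chain built above (signed intermediate plus ADCS root) the count should be 2.
if [ -f full_chain.pem ]; then
  count_certs full_chain.pem
fi
```

This only counts PEM blocks; it does not validate signatures. If OpenSSL is installed, openssl verify -CAfile ADCS_root.pem From_ADCS_signed_certificate.pem checks that the intermediate was actually signed by the root.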

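# Set the issuing-certificate and CRL URLs; they must use the mount path of this PKI engine

vault write pki_intermediate_ca_core/config/urls \
            issuing_certificates="https://vault.example.com/v1/pki_intermediate_ca_core/ca" \
            crl_distribution_points="https://vault.example.com/v1/pki_intermediate_ca_core/crl"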

Test

# Create a role, then issue a certificate from it (a max_ttl of 43830h is about 5 years):

vault write pki_intermediate_ca_core/roles/generic_server_cert allowed_domains="example.com" max_ttl="43830h" allow_subdomains=true
vault write pki_intermediate_ca_core/issue/generic_server_cert common_name="testserver01.example.com" ttl="24h" > testserver01

Ref:

[1] https://www.vaultproject.io/api-docs/secret/pki#set-signed-intermediate
[2] https://www.vaultproject.io/docs/secrets/pki