The Client is a global company that develops its own SaaS platform to serve customers ranging from mid-sized to large enterprises. The platform has a significant code base that has been under development for 6-7 years. The market is extremely competitive and forces tough business decisions about how development is run and prioritized.
The Client has onboarded an important new customer and has contractually committed to Service-Level Agreement (SLA) targets for the key features that the customer’s business processes depend on. As a result, a simple visual way is needed to report those SLA metrics both to the customer and to the Client’s management team.
The challenges:
- The existing SaaS platform’s metrics do not cover those SLAs
- Deep code refactoring to produce those SLA metrics is not an option
- Visual dashboards must be created so that the customer and the Client’s management teams can monitor them through a self-service portal
- Because the platform is SaaS, the dashboards must be auto-provisioned as part of the standard delivery process
- Proper role-based security policies must be applied
- A successfully delivered Proof-of-Concept (PoC) phase to demo the potential solutions
- A successfully delivered Minimum Viable Product (MVP) so that the Client’s team can follow the trajectory
The overall project timeline:
- PoC phase – 3 weeks
- MVP phase – 2 months
- KPI and SLA dashboard deployment has been integrated into the standard delivery process
- A flexible, standard mechanism for creating new dashboards has been created
- The Client’s team has been trained to use the approach and to extend it
SUCCESS STORY IN DETAIL
Instrumenting the platform with proper metrics belongs to the Non-Functional Requirements (NFR). Products that emerge in highly competitive environments commonly struggle to satisfy NFRs, because the focus is usually on delivering more functional features (Functional Requirements) rather than on NFRs. As a result, it can be a challenge when the business asks to implement metrics that shed light on the business performance of features, to introduce a couple of new performance indicators (PI), and finally to build dashboards for Key Performance Indicators (KPI) that are to be promoted into Service-Level Agreement (SLA) metrics.
- Metrics: whether to build proper ones as part of the fundamental data structures or to have some indirect control over the flow?
- Metrics management and storage: whether to build a proprietary solution that best fits the purpose or to integrate a 3rd party tool?
- Dashboard management: the same build-or-integrate question as for the metrics themselves
Taking into account all considerations and requirements, we proposed the following:
- To integrate a 3rd party Application Performance Monitoring (APM) tool
- To build the metrics on top of logs generated by the system
- Research and evaluation of several tools to pick the best-fitting one
To name a few: ELK, New Relic, Amazon CloudWatch, Azure Application Insights
- Research and evaluation of the existing logging practices within the platform and whether they are sufficient to build the required metrics
- The plan required code improvements and coordination with the client’s team
As a result of close cooperation with the Client’s team, we worked out a joint plan, which was incorporated into the product’s release plan.
What has been done:
- Logging mechanisms have been improved to produce the data necessary for metrics
- Logging has been structured to simplify parsing for metric value extraction
- Dashboards have been designed per user role
- Dashboard deployment has been integrated into the existing automated deployment pipeline
- Education and training sessions have been held for the Client’s dev teams
- Modern APM tools allow the building of powerful monitoring solutions based on standard application logs, with no need to create specialized performance counters or similar mechanisms
- Log messages should be structured so that parsing is quick and dashboards do not suffer performance issues
- Log-based monitoring has accuracy limitations and must be reviewed carefully against your specific domain and the PIs to be generated
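To make the idea of log-based metrics more concrete, here is a minimal Python sketch; it is not the Client's implementation. It assumes each log message is written as one JSON object per line and shows how a simple SLA indicator (the share of successful requests that finish within a latency target) could be derived from such lines. The names `JsonFormatter`, `build_logger` and `sla_compliance`, the fields `feature`, `duration_ms` and `status`, and the 500 ms target are all hypothetical and used for illustration only. In a real setup the aggregation step would normally be done by the APM tool rather than by hand-written code.

```python
import io
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "ts": record.created,
            "level": record.levelname,
            "event": record.getMessage(),
        }
        # Hypothetical convention: metric fields arrive via extra={"fields": {...}}.
        payload.update(getattr(record, "fields", {}))
        return json.dumps(payload)


def build_logger(stream) -> logging.Logger:
    """Create a logger that writes structured lines to the given stream."""
    logger = logging.getLogger("sla_demo")
    logger.handlers.clear()
    logger.propagate = False
    logger.setLevel(logging.INFO)
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    return logger


def sla_compliance(lines, feature, max_ms):
    """Return (share of requests within max_ms, count of unparseable lines)."""
    total = within = skipped = 0
    for line in lines:
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            skipped += 1  # unstructured lines cannot contribute to the metric
            continue
        if not isinstance(entry, dict):
            skipped += 1
            continue
        if entry.get("event") != "request_completed" or entry.get("feature") != feature:
            continue
        total += 1
        fast_enough = entry.get("duration_ms", float("inf")) <= max_ms
        if entry.get("status") == "ok" and fast_enough:
            within += 1
    ratio = within / total if total else None
    return ratio, skipped


# Demo: four structured requests plus one legacy, unstructured line.
buffer = io.StringIO()
log = build_logger(buffer)
for duration, status in [(120, "ok"), (480, "ok"), (950, "ok"), (200, "error")]:
    log.info(
        "request_completed",
        extra={"fields": {"feature": "export", "duration_ms": duration, "status": status}},
    )
buffer.write("unstructured legacy line\n")

ratio, skipped = sla_compliance(buffer.getvalue().splitlines(), "export", 500)
print(f"within SLA: {ratio:.0%}, skipped lines: {skipped}")  # within SLA: 50%, skipped lines: 1
```

The skipped-line counter illustrates the accuracy limitation mentioned above: any request that is logged in an unstructured form, or not logged at all, silently drops out of the indicator, so the share of unparseable lines is worth tracking alongside the metric itself.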