Tips
This article provides tips and recommendations for your optimization journey.
Start with Unused Resources
The Unused Resources use case is the easiest to implement and shows you where the waste is. It is also the most cost-effective. Do not try to address every resource in this use case before moving on to another; instead, go for the "big fish" early on.
Look for patterns
Take a close look at the output of the Unused Resources use case and try to identify patterns, such as the resource types with the highest costs. It may tell you, for example, that data solutions using SQL databases represent 50% of total unused resources.
By identifying patterns and groups of resources, you can narrow the area of optimization down to a few engineering teams within the organization. This can save a lot of time: instead of approaching each resource individually, you can call a meeting with one team, and the outcome of that meeting can be large savings that address many resources across the tenant.
Where possible, it is always better to delete a whole resource group with 50 resources than to come back to each resource at a different time and chase many people. You can load the CSV file into Excel and explore further, for example by identifying resource groups that contain many identified resources. These could be workloads worth looking into.
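The same exploration can be scripted instead of done in Excel. The following is a minimal sketch, assuming the exported CSV has `resource_group`, `resource_type`, and `monthly_cost` columns. These column names, the sample rows, and the function name are hypothetical and should be adapted to the actual export.

```python
import csv
import io
from collections import defaultdict


def summarize_by_resource_group(csv_text, min_resources=2):
    """Rank resource groups by the total monthly cost of their flagged resources.

    Column names are assumptions for illustration, not the real export schema.
    """
    totals = defaultdict(lambda: {"count": 0, "monthly_cost": 0.0})
    for row in csv.DictReader(io.StringIO(csv_text)):
        group = totals[row["resource_group"]]
        group["count"] += 1
        group["monthly_cost"] += float(row["monthly_cost"])

    # Keep only groups with several flagged resources, most expensive first.
    ranked = [
        (name, stats["count"], stats["monthly_cost"])
        for name, stats in totals.items()
        if stats["count"] >= min_resources
    ]
    ranked.sort(key=lambda item: item[2], reverse=True)
    return ranked


# Hypothetical sample data standing in for the exported CSV file.
SAMPLE = """resource_group,resource_type,monthly_cost
rg-data-legacy,sql_database,400
rg-data-legacy,sql_database,350
rg-data-legacy,storage_account,50
rg-web,app_service_plan,120
rg-web,virtual_machine,80
rg-misc,disk,10
"""

for name, count, cost in summarize_by_resource_group(SAMPLE):
    print(f"{name}: {count} resources, {cost:.2f} per month")
```

Resource groups at the top of this list are candidates for a single conversation with the owning team rather than resource-by-resource follow-up.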
Delay purchasing of Azure Reservations
If you are like many companies, you are most likely using Azure Reservations. The problem is that you are probably purchasing reservations for resources that are unused, over-provisioned, or on an old SKU, all for a low percentage saving. If your reservations are about to expire, you should still start with the Unused Resources use case and delay the purchase of reservations, because this is the use case where we have seen the largest savings within a few weeks. After spending a little time on unused resources, move on to the Over Provisioned and Modernizer use cases, which directly influence Azure Reservations.
If you have not been using Azure Reservations, we recommend having a look, as they can save 30-50% on optimized resources that support reservations (VMs, App Service Plans, Scale Sets).
If you know that a large number of resources will be deleted, purchase fewer reservations than you did last year to account for this.
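As a rough illustration of that adjustment, the sketch below subtracts planned deletions from last year's reservation quantities per SKU. The function name, the SKU labels, and the quantities are hypothetical; real sizing should also account for the Over Provisioned and Modernizer changes mentioned above.

```python
def adjusted_reservations(last_year, planned_deletions):
    """Return reservation quantities per SKU after removing planned deletions.

    Both arguments map a SKU label to an instance count. Quantities never
    drop below zero.
    """
    return {
        sku: max(quantity - planned_deletions.get(sku, 0), 0)
        for sku, quantity in last_year.items()
    }


# Hypothetical numbers for illustration only.
last_year = {"vm-size-a": 20, "vm-size-b": 8, "app-plan-c": 5}
planned_deletions = {"vm-size-a": 6, "app-plan-c": 7}

print(adjusted_reservations(last_year, planned_deletions))
```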
Mark resources manually
The Manually Marked use case is a great tool for telling the application about false positives and for acknowledging that a particular resource cannot be optimized.
Because this use case brings estimated savings to zero (the application ignores all other use cases for a manually marked resource), it is also a way to reduce your detected waste. Our pricing scheme is based on several factors, one of which is total detected waste, so you can use this use case to lower our fee. Of course, we expect you to mark only resources that are false positives or that are restricted in some other way that prohibits any changes.
Simplify process of deleting resources containing data
Compute resources that don't hold data, such as App Service Plans and VMs (just the VM configuration, not the disks), are generally very easy to clean up once they come to light. The same goes for non-production databases and data containers (storage accounts, disks, ...).
The problematic ones are production resources, including those that are no longer in use and those that only appear to be production resources and have never been used.
There is no silver bullet for this, because it is a significant risk to "just go ahead and delete them". This causes friction for the people trying to get rid of them: consent for deletion is relatively hard to obtain and can lead to long discussions, meetings, and email chains without end.
If this is not the case at your company, great, but for most companies this will be the biggest challenge in optimizing an Azure environment.
Deleting data resources
We have seen relative success with the following approach. It reduces to a minimum the chatter about whether a resource that has had no usage for a year, and whose deletion would save 2000 per month, can be deleted. At the same time, it provides relative safety for the person or team doing the deletion.
Tooling Scripts
The first core component of this approach is an automation script that does one of the following:
- back up or export a database to a storage account; there are many database engines, such as PostgreSQL, MySQL, and MSSQL, so you will need either a separate script for each or one script that works with several database providers
- copy files from one storage account to another
- copy VM disk to storage account
The second component is the bring up script.
Its purpose is to bring the exported data back to its original form. For a database, for example, the bring up script re-creates the database from the previously exported copy.
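One way to keep the export script and the bring up script together is to define them as a pair per resource type. The sketch below does this for PostgreSQL only, using the standard `pg_dump` and `pg_restore` client tools. The `ArchiveTooling` class, the function name, and the database and file names are hypothetical, and the sketch only builds the commands; running them and uploading the dump file to the archive storage are left out.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ArchiveTooling:
    """A matched pair of commands: one to export, one to bring the data back."""
    export_command: List[str]
    bring_up_command: List[str]


def postgres_tooling(source_db, restore_db, dump_file):
    """Build export and bring up commands for a PostgreSQL database.

    Assumes connection settings (host, user, password) come from the
    environment, and that restore_db already exists as an empty database.
    """
    return ArchiveTooling(
        # Custom-format dump, which pg_restore can read back.
        export_command=[
            "pg_dump",
            "--format=custom",
            f"--file={dump_file}",
            f"--dbname={source_db}",
        ],
        # Restore the dump into the target database.
        bring_up_command=[
            "pg_restore",
            f"--dbname={restore_db}",
            dump_file,
        ],
    )


# Hypothetical names for illustration only.
tooling = postgres_tooling("orders", "orders_restored", "orders.dump")
print(" ".join(tooling.export_command))
print(" ".join(tooling.bring_up_command))
```

Other database engines, storage accounts, and VM disks would each get their own pair built the same way, so that every export has a known, tested way back.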
Archive Resource
This is some form of cheap storage. In Azure, you can use a storage account with the archive tier.
The process
With the tooling and the archive resource in place, the process becomes much less tedious and less stressful.
This is because of these three facts:
- You now have a way to save almost 100% of the cost of the resource, and you have automation to export the data to cheap storage
- You can invest a limited amount of time in finding out whether you can delete the resource; if it takes too long, set a time limit after which you act without the answer
- If the resource was deleted incorrectly and must be re-created, even months or years later, you can use the bring up script to restore it, although in our experience this is rarely needed
Once you know which resource is to be deleted, you can set a time limit on how long to chase the approval. Let's say you give the owner 2 weeks. If you don't get any response, or nobody in the company knows anything about the resource (yes, this happens more often than you would think):
- Obtain archive storage
- Export the resource to archive storage using the tooling
- Test the restore (the bring up script) from archive storage; if it's a database, also test the connection to the database and check that you can see tables with data
- Consider storing the restoration script next to the actual exported data
- Delete the restored test copy
- Delete the actual source resource
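The time limit and the order of the steps above can be sketched as follows. This is an illustration under stated assumptions, not a definitive implementation: the function names and return values are hypothetical, the four callables stand in for your own tooling, and keeping the source resource when the restore test fails is an added safeguard.

```python
from datetime import date, timedelta

# The 2-week window from the example above.
APPROVAL_WINDOW = timedelta(weeks=2)


def next_action(first_request, today, owner_replied=False):
    """Decide what to do with a resource that is awaiting approval."""
    if owner_replied:
        return "follow_owner_decision"
    if today - first_request >= APPROVAL_WINDOW:
        return "archive_and_delete"
    return "keep_waiting"


def archive_and_delete(export, test_restore, delete_test_copy, delete_source):
    """Run the steps in order; the source is deleted only after a good restore.

    Each argument is a callable supplied by your own tooling. test_restore
    must return True when the restored copy is usable.
    """
    export()
    restored_ok = test_restore()
    delete_test_copy()
    if not restored_ok:
        return False  # Keep the source resource; the archive is not trusted.
    delete_source()
    return True


# Usage with stand-in steps that only record what happened.
log = []
done = archive_and_delete(
    export=lambda: log.append("export"),
    test_restore=lambda: log.append("test_restore") or True,
    delete_test_copy=lambda: log.append("delete_test_copy"),
    delete_source=lambda: log.append("delete_source"),
)
print(next_action(date(2024, 1, 1), date(2024, 1, 15)), done, log)
```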