Using AWS Systems Manager to Tag and Patch On-Premises Servers

In my role, I’m responsible for a global fleet of 5000+ on-premises Windows Servers whose remote management will shortly be handled solely by AWS Systems Manager. We will use Patch Manager and Maintenance Windows, both capabilities of AWS Systems Manager, to maintain and patch the entire fleet. The first mechanism we will implement is tagging, so that resources can be targeted by a Systems Manager Association that calls an Automation Document.

Using AWS services (EventBridge, SNS, SQS and Lambda), all EC2 and AWS::SSM::ManagedInstance resources will be tagged with a ResourceType key/value tag. This will enable the following:

  • Resource Targeting – Tags such as ResourceType can be applied to distinguish between the two main use cases (EC2Instance vs ManagedInstance) and used as a target within Resource Groups
  • Managing On-Prem Agents – Use of the SSM Agent will become commonplace as resources are removed from CorpNet. Efficient grouping of the On-Prem resources will enable specific management of these hosts compared to EC2
  • Configuration Drift – Recurring cycles will detect and correct any changes that fall outside the defined configuration
  • Software Upgrades – Software can be targeted to specific use cases and released to newly registered agents as quickly as possible
  • Patching – Specific SSM Maintenance Windows can be applied so that reboots happen only within those windows
  • Automation – The system will manage itself, requiring human intervention only if faults occur
  • Scalability – The tagging mechanism will recognise newly added agents
  • Efficiency – Lightweight and event-driven. Local ground teams will be 99% hands off with this approach. Reaching 100% is not possible, as teams may still be required to execute some workflows on site.
  • Reusability – All services will make use of queuing and subscriptions to ensure the underlying mechanisms are reusable by other services
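
As a rough illustration of the tagging step, here is a minimal Python sketch of what the final Lambda in the EventBridge, SNS, SQS and Lambda chain could look like. It is not the actual implementation. It assumes the SNS subscription does not use raw message delivery (so each SQS body is an SNS envelope with a "Message" field), and that the EventBridge event carries the instance ID at detail["instance-id"]. That field location, and the function names, are assumptions for illustration only. The tag value is chosen from the ID prefix: managed instances registered through a hybrid activation have IDs starting with "mi-", while EC2 instance IDs start with "i-".

```python
import json

TAG_KEY = "ResourceType"


def resource_type_for(instance_id):
    """Map an instance ID to a ResourceType tag value using its prefix."""
    if instance_id.startswith("mi-"):
        return "ManagedInstance"
    if instance_id.startswith("i-"):
        return "EC2Instance"
    raise ValueError(f"unrecognised instance ID: {instance_id}")


def instance_ids_from_sqs_event(event):
    """Unwrap SQS record -> SNS envelope -> EventBridge event and yield IDs."""
    for record in event.get("Records", []):
        sns_envelope = json.loads(record["body"])
        eb_event = json.loads(sns_envelope["Message"])
        # Assumption: the instance ID lives at detail["instance-id"].
        yield eb_event["detail"]["instance-id"]


def tag_instance(instance_id, ec2, ssm):
    """Apply the ResourceType tag through the API that owns the resource."""
    value = resource_type_for(instance_id)
    tags = [{"Key": TAG_KEY, "Value": value}]
    if value == "ManagedInstance":
        ssm.add_tags_to_resource(
            ResourceType="ManagedInstance", ResourceId=instance_id, Tags=tags
        )
    else:
        ec2.create_tags(Resources=[instance_id], Tags=tags)
    return value


def handler(event, context):
    """Lambda entry point: tag every instance referenced in the SQS batch."""
    import boto3  # imported here so the helpers above can be tested offline

    ec2 = boto3.client("ec2")
    ssm = boto3.client("ssm")
    return [tag_instance(i, ec2, ssm) for i in instance_ids_from_sqs_event(event)]
```

Passing the clients into tag_instance keeps the tagging logic separate from the AWS wiring, so it can be exercised with simple stubs before it is deployed.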

The applied ResourceType tag will then be used in future solutions to define Resource Group membership. One example is a Systems Manager Association that runs an Automation Document, where Resource Groups are the only allowed target for ManagedInstances.
The tagging pipeline can be refactored easily by attaching another SQS queue and paired Lambda between the SNS topic and the final Lambda.
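
To make the Resource Group idea a little more concrete, the sketch below builds a tag-based Resource Group query on the ResourceType tag and the matching Targets entry for a Systems Manager Association. Treat it as a sketch under assumptions rather than the final design: the group name "OnPremManagedInstances" and the helper names are hypothetical, and the association itself (document name, schedule, parameters) is left out because those details are not settled here.

```python
import json


def tag_query(tag_value, resource_type_filter):
    """Build a tag-based ResourceQuery that matches on the ResourceType tag."""
    query = {
        "ResourceTypeFilters": [resource_type_filter],
        "TagFilters": [{"Key": "ResourceType", "Values": [tag_value]}],
    }
    # Resource Groups expects the query itself as a JSON string.
    return {"Type": "TAG_FILTERS_1_0", "Query": json.dumps(query)}


def association_targets(group_name):
    """Build the Targets list that points an association at a Resource Group."""
    return [{"Key": "resource-groups:Name", "Values": [group_name]}]


def create_group(group_name, tag_value, resource_type_filter):
    """Create the Resource Group (needs AWS credentials, so not run here)."""
    import boto3  # imported here so the builders above can be tested offline

    client = boto3.client("resource-groups")
    return client.create_group(
        Name=group_name,
        ResourceQuery=tag_query(tag_value, resource_type_filter),
    )


if __name__ == "__main__":
    # Hypothetical group name, used only for illustration.
    print(tag_query("ManagedInstance", "AWS::SSM::ManagedInstance"))
    print(association_targets("OnPremManagedInstances"))
```

Because membership is driven by the tag query, any newly registered agent that receives the ResourceType tag would fall into the group without the group or the association being edited.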

More to follow….
