Site Reliability Engineer

Description

The Site Reliability Engineer (SRE) is responsible for keeping all member-facing and internal production systems running smoothly. As an SRE, you will work with multiple teams to encourage SRE principles, maintain the availability and reliability of systems, establish SLIs/SLOs, and develop tools and monitoring for operational visibility. SREs are members of the scrum teams and work closely with quality and software engineers to support services prior to general availability through activities such as launch reviews, performance reviews, and logging validation in dev environments. The SRE is also responsible for ensuring quality releases to production environments and participates in an on-call rotation, working with internal and vendor teams to manage, troubleshoot, and resolve production issues.

LOCATION

Mountain America Center - Hybrid:

9800 S Monroe St
Sandy, UT 84070

SCHEDULE

Full Time

To be effective, an individual must be able to perform each job duty successfully.

  • Keep current with emerging testing techniques and technologies, as well as emerging development practices.
  • Assist in diagnosing, finding the root cause, reporting, and tracking production and non-production issues.
  • Continually research new ways of improving and scaling systems and services.
  • Lead initiatives to improve the reliability, scalability and availability of production applications.
  • Build out tools, platforms, and processes to enable these goals.
  • Lead and contribute to the design, development, and improvement of SRE practices and procedures.
  • Create and maintain health dashboards, identifying and measuring health indicators and SLIs/SLOs, and providing tools for operational visibility of production systems.
  • Participate in and contribute to improving our incident response, acting as an escalation point for production incidents.
  • Perform root cause analysis (RCA), troubleshoot, and debug issues across our applications and services to identify and fix the root cause.
  • Enhance and maintain the software release procedures and processes.
  • A strong desire and aptitude for system automation to eliminate manual work in day-to-day operations.
  • Skilled with application monitoring practices and tools (New Relic, Azure Monitor, DataDog, Splunk, etc.).
  • Understanding of and experience with SRE and DevOps principles. Demonstrated experience working in Agile teams leveraging Scrum, Kanban, or other methodologies and/or understanding of Agile development concepts.
  • Meets the needs of the end user in a quality, consistent, and professional manner, using independent judgment where appropriate.
  • Mentors less experienced engineers.
  • Excellent communication skills (verbal and written) are critical, along with exceptional problem-solving skills and exceptionally professional behavior when interacting with and responding to other technical teams throughout the organization.
  • Take part in an on-call rotation.
  • Performs additional duties and responsibilities as assigned.

KNOWLEDGE, SKILLS, & ABILITIES

The requirements listed are representative of the knowledge, skills, and/or abilities required. Reasonable accommodations may be made to enable individuals with disabilities to perform the essential job functions.

EXPERIENCE

  • Minimum 4 years of professional experience in site reliability engineering, software development, or systems administration
  • Experience monitoring or troubleshooting web applications
  • Experience with Scrum and associated tools such as Azure DevOps or Jira
  • Experience with some of the following tool sets:
    • Application monitoring tools (New Relic, DataDog, Splunk, etc.)
    • Automation tools (Pega, Microsoft Power Platform, Logic Apps, etc.)
    • API tools (Rest#, Postman, Swagger, etc.)
    • Front end tools (Selenium, Page Object Model, etc.)
    • Backend tools (SQL Server, Entity Framework, Dapper, etc.)
    • Build tools (Node, Docker, Azure Pipelines, etc.)
    • Infrastructure as Code (Terraform, Ansible, Chef, etc.)
  • Experience with automating, monitoring, and/or alerting on some of the following:
    • Web applications in Angular and React
    • Internal support tools
    • Third-party integrations
    • Database and API connections (REST and SOAP)
    • Cloud Solutions (AWS, Azure, or others)
  • Experience working in an agile CI/CD or rapid software testing environment.
  • Experience with and understanding of Git and source control concepts.

EDUCATION

Education must be from an accredited institution. Education will be verified.

  • Bachelor’s Degree in computer science, computer information systems, management information systems, or related technical field, or equivalent experience.

MANAGERIAL RESPONSIBILITY

Has no supervisory/managerial responsibilities. May provide coaching and/or mentoring to others on the team.

OTHER SKILLS & ABILITIES

  • Demonstrated proficiency with Microsoft Office Suite including Outlook, Word, PowerPoint, and Excel.
  • Ability to work both autonomously and collaboratively in a fast-paced environment.
  • Self-starter with strong organizing and time management skills.
  • Adaptable to change; responds positively to altered circumstances or conditions.
  • Possess a desire and willingness to learn and continually update knowledge base on financial concepts, strategies, systems, etc.
  • Take initiative to be a problem solver and provide suggestions to improve processes and efficiencies.
  • Excellent interpersonal skills including the ability to collaborate with other teams as needed.
  • Data analytics and data validation skills.
  • Demonstrated ability to clearly express ideas, methodology, results, and recommendations verbally, in writing and through insightful reports and graphic illustrations.

PHYSICAL ABILITIES / WORKING CONDITIONS

  • Ability to sit, talk and hear consistently
  • Ability to stand, walk, and use hands to handle or reach occasionally
  • Close vision (clear vision at 20 inches or less)
  • Distance vision (clear vision at 20 feet or more)
  • Ability to lift up to 25 pounds occasionally; may need to lift up to 50 pounds.

ENVIRONMENTAL

There are no unusual environmental factors. Work is conducted in a typical office setting with moderate noise (e.g., business office with computers and printers, light traffic).
