
Job Description:
As a Site Reliability Engineer (SRE), you’ll help build a meaningful engineering discipline, combining software and systems to develop creative engineering solutions to operations problems. Much of our support and software development focuses on optimizing existing systems, building infrastructure, and reducing work through automation. You’ll join a team of curious problem solvers with a diverse set of perspectives who are thinking big and taking risks. In this environment, you’ll take the lead on relevant projects, supported by an organization that provides the support and mentorship you need to learn and grow. As an SRE, you’ll be focused on running better production applications and systems.
The SRE team runs, maintains, and improves the Big Data Platform against established Service Level Objectives by applying software engineering practices. It is responsible for the availability, performance, change management, monitoring, and capacity management of its services, with special emphasis on automating the processes and workloads that support them. The SRE team is also responsible for operational support of the Big Data infrastructure, with emphasis on the ability to feed outage, issue, and incident data into a design and SDLC feedback loop to ensure maximum automation and outage avoidance.
Job Responsibilities:
- Coach or manage teams as applicable
- Engage with the development team throughout the life cycle to help develop software for reliability and scale, ensuring minimal refactoring or changes
- Design, code, test, and deliver software to automate manual operational work
- Identify application patterns and analytics in support of better service level objectives
- Design automated software and product upgrades, change management, and release management solutions
- Troubleshoot priority incidents, facilitate blameless post-mortems and ensure permanent closure of incidents
- Design self-healing and resiliency patterns
- Participate in the 24×7 support coverage as needed
Job Requirements:
- Expertise in designing, coding, testing, and delivering software in at least one technology stack
- Proficiency in one or more technology domains; may be a cross-domain expert able to solve complex, mission-critical problems within a business or across the firm
- Knowledge of Unix/Linux administration, Unix scripting, and platform-level orchestration scripting
- Hands-on experience with large-scale software development, preferably in one of these languages: Java, Python, or a scripting language
- Bachelor’s degree or equivalent experience in a software engineering discipline
- Working knowledge of infrastructure components (e.g. routers, load balancers, cloud products, container systems, compute, storage, and networks)
- Excellent debugging and troubleshooting skills
- Knowledge of automating the build and deployment process
- Knowledge of or experience with Hadoop environment administration, release deployments to HBase, supervising Hadoop jobs, and performing cluster coordination services is preferred
Job Details:
Company: J.P. Morgan
Vacancy Type: Full Time
Job Location: Houston, TX, US
Application Deadline: N/A