Our client is a pioneering technology company and is currently developing an advanced version of the DATA CENTER PREDICTIVE ANALYTICS (DCPA) product. The initial version of DCPA has successfully integrated with local data centers, predicting potential failures at the rack level using Machine Learning solutions. Committed to continuous improvement, the company is actively seeking a skilled Diagnostic Engineer to join the team and contribute significantly to the enhancement of diagnostic tools.
Responsibilities
- Develop sophisticated diagnostic software tools for identifying and resolving hardware issues in server systems.
- Design automated testing scripts and programs to assess the performance and reliability of server components.
- Collaborate closely with hardware engineers to gain an in-depth understanding of various server architectures and devise software solutions for hardware testing and diagnostics.
- Debug and refine diagnostic software to enhance accuracy and efficiency.
- Implement and maintain diagnostic software across various platforms and operating systems, using open-source or vendor-specific tools where applicable.
- Utilize data analytics and machine learning algorithms to predict failures and optimize server maintenance.
- Collaborate with various IT organizations and their support teams to integrate diagnostic software into broader system health monitoring frameworks.
Requirements
- Proficiency in programming languages such as C# and Python, including developing scripts (shell/PowerShell) and applications (C#/Python) that help identify, analyze, and resolve hardware issues in server systems.
- Strong experience in software development, with a focus on diagnostic tools or related applications.
- Understanding of server hardware and the interaction between software and physical components.
- Experience with development and debugging, version control systems (e.g., Git), and continuous integration/continuous deployment (CI/CD) pipelines.
- Knowledge of machine learning techniques for predictive diagnostics is highly desirable.
- Strong understanding of cloud delivery architecture or automated data delivery to Azure or a similar platform.
- Ability to use or learn Redfish for accessing server and systems telemetry data as required.
Working Conditions and Benefits
- Paid vacation and sick leave (no medical certificate required).
- Official state holidays: 11 days are observed as public holidays.
- Professional growth through work on challenging projects, with the possibility to switch roles and master new technologies and skills with company support.
- Flexible working schedule: 8 hours per day, 40 hours per week.
- Personal Career Development Plan (CDP).
- Employee support program (Discount, Care, Health, Legal compensation).
- Paid external training, conferences, and professional certifications that meet the company’s business goals.
- Internal workshops & seminars.
- Corporate library (Paper/E-books) and internal English classes.
The ideal candidate should have experience analyzing logs and telemetry data from different servers, CPUs, GPUs, storage devices, etc. They should be capable of identifying existing or potential hardware problems based on errors found in logs and telemetry. The candidate should be able to suggest effective solutions to hardware issues and have hands-on experience with server hardware (diagnostics, troubleshooting, etc.). Developing scripts (shell/PowerShell) and applications (C#/Python) to identify, analyze, and resolve hardware issues in server systems is a key aspect of this role.
Some daily overlap with Redmond (PST) business hours is expected; the usual 1-3 hours of overlap in the PST morning will be sufficient.
In summary, we are looking for a developer to join our dynamic team, contribute expertise to advance diagnostic tools, and play a pivotal role in shaping the future of data center technology. If you are passionate about innovation and problem-solving, your application is welcome.