Data Wrangler

Genomics England

Are you interested in working with large-scale datasets and shaping the future of genomic healthcare?

We are currently recruiting for a Data Wrangler to join us here at Genomics England!

As a Data Wrangler, you will specialise in optimising the performance and seamless movement of large data volumes using specialist tooling. You will be responsible for curating and transforming datasets, generating key statistics, and deriving new datasets tailored for diverse audiences.

This role will also include managing data workflows, developing and maintaining data pipelines, collaborating with cross-functional teams to understand data requirements, and ensuring data integrity. Additionally, you will explore new technologies and contribute to knowledge sharing across the Data Chapter. 

Key responsibilities

  • Design and build data solutions that meet business needs and requirements across clinical and research domains

  • Extract, transform, and load (ETL) data to support research and clinical practices

  • Generate and derive statistics and data visualisations to support data-driven decision-making

  • Codify repeatable data processes, increasing productivity and efficiency and supporting the standards for data usage throughout Genomics England

  • Ensure data quality is at the centre of data delivery, using processes including automated routines and self-healing/monitoring 

  • Implement and adhere to software development best practices

  • Develop testing routines and datasets to ensure robust and consistent product delivery

  • Develop healthcare data models and associated artifacts

  • Manage associated data storage and management solutions, ensuring the optimum architecture exists to support data solutions

  • Work with cloud-first technologies to leverage the latest healthcare data solutions

Example tooling

  • Cloud: AWS or equivalent Cloud experience

  • Data Processing: AWS Glue, Python, SageMaker, Prefect

  • Data formats and models: XML, JSON, HL7, FHIR, OMOP

  • Databases: AWS S3 & Athena, AWS DynamoDB, AWS RDS, AWS Aurora (Postgres)

  • Continuous deployment: AWS Lambda, Docker, Kubernetes

  • Languages: Python, SQL

  • Visualisation software: Tableau

  • Practices: DMBOK2, Continuous Integration/Continuous Deployment (CI/CD)

Qualifications

Ideally, a Master’s degree or equivalent experience working in data management, biostatistics, clinical informatics or data analysis.
