Job Title: Data Analyst - AI & Threat Analytics Team - Remote
Job Overview
We are seeking a skilled, motivated Data Analyst to join our AI & Threat Analytics team. In this role, you will help advance our autofill classification models by overseeing, optimizing, and analyzing datasets. The position is fully remote, with an optional hybrid arrangement for candidates in the El Dorado Hills, CA, or Chicago, IL, metropolitan areas.
Key Responsibilities
Take complete ownership of the data lifecycle including collection, cleaning, and preprocessing for HTML-based datasets utilized in machine learning applications.
Employ web analysis tools to extract and organize data from DOM environments for effective model training and validation.
Collaborate with machine learning engineers to facilitate feature engineering experiments, producing training datasets that align with model specifications.
Create and enhance synthetic datasets using large language models (LLMs) to improve the balance and accessibility of training data.
Analyze data with dimensionality reduction techniques such as t-SNE, PCA, and UMAP to evaluate feature strengths and enhance dataset quality.
Streamline data workflows through automation to facilitate efficient data processing and transformation.
Develop and maintain comprehensive documentation for data workflows, ensuring lineage, reproducibility, and scalability.
Establish validation and data quality systems to guarantee consistency and integrity across all datasets.
Required Skills
Minimum of 2 years of professional experience as a Data Analyst, preferably within a cybersecurity or machine learning context.
Proficiency in Python for data manipulation and analysis (utilizing libraries like Pandas and NumPy) and automation of data workflows.
Strong familiarity with web analysis tools (e.g., Selenium, BeautifulSoup) and a thorough understanding of HTML and DOM structures for data extraction.
Knowledge of natural language processing (NLP) techniques such as tokenization, stop word removal, and lemmatization.
Experience in generating synthetic datasets and utilizing LLMs to supplement machine learning data.
Ability to collaborate effectively with machine learning engineers and technical teams.
Strong analytical and problem-solving abilities with a diligent focus on data quality and governance.
Familiarity with cloud platforms (AWS, GCP, Azure) for data storage and processing.
Bachelor's degree in Data Science, Statistics, Computer Science, or a related field, or equivalent experience.
All applicants must be U.S. Persons due to the role's involvement in GovCloud.
Career Growth Opportunities
This position offers substantial opportunities for professional growth within the AI and threat analytics sector, allowing you to enhance your skillset and advance your career within a leading organization.
Company Culture and Values
We are dedicated to fostering a diverse and inclusive environment that promotes collaboration and innovation, ensuring all employees feel valued and respected.
Compensation and Benefits
Comprehensive medical, dental, and vision insurance (including coverage for domestic partnerships).
Employer-paid life insurance and employee/spouse/child supplemental life insurance.
Voluntary short/long-term disability insurance.
401(k) plan with both Roth and traditional options available.
Generous paid time off (PTO) plan acknowledging your commitment and seniority, which includes paid bereavement and jury duty leave.
Competitive annual bonuses.
We encourage diverse candidates to apply and are committed to maintaining an inclusive workplace.
Employment Type: Full-Time