Eric Soderquist

Eric (Yunjae) Soderquist

Building @ Speakease AI | Data Science & Analytics Intern @ AGCO | Brain & Cognitive Science @ Illinois

About Me

Innovative Software Engineer, Data Scientist, and AI Specialist with a focus on:

  • Large Language Models (LLMs)
  • Cloud Infrastructure
  • Real-time AI Solutions

Proven expertise in both technical programming and strategic AI development, including:

  • Data Pipeline Orchestration
  • Dataset Curation
  • RLHF Fine-tuning
  • Prompt Engineering

Spearheaded high-impact projects, from automating AWS cloud deployments and web scraping solutions to building scalable AI-powered platforms for cross-linguistic communication.

Adept at translating complex neural architectures into real-world applications, such as BCI systems and multimodal translation models, with a passion for creating user-centric, data-driven solutions that bridge technology and human interaction.

Experience


Founder, CEO

Speakease AI · Jan 2023 - Present

  • Founded Speakease AI to develop a free, AI-driven platform providing real-time voice translation across 50+ languages, addressing the need for accessible cross-linguistic communication tools.
  • Led all coding and machine learning development, creating a custom multimodal language model utilizing Reinforcement Learning from Human Feedback (RLHF) and divergence training, resulting in high-quality, context-aware translations.
  • Architected and maintained cloud-native infrastructure using TypeScript, Next.js, Supabase, and Azure; deployed models on Azure OpenAI service and secured cloud funding through a Microsoft partnership.
  • Implemented low-latency, real-time interactions by leveraging Azure's infrastructure to ensure high availability and scalability for seamless user experiences.
  • Managed end-to-end engineering, data science, and data engineering efforts, from model training and fine-tuning to infrastructure optimization, enabling translations adaptive to sociocultural and emotional contexts.
  • Reached more than 1,000 active users, providing accessible, real-time translation support to immigrants, ESL learners, travelers, and language learners.
  • Pioneered innovations in AI translation using advanced machine learning techniques, creating a scalable, cloud-native solution that democratizes access to high-quality language tools.
Generative AI
Natural Language Processing (NLP)
Large Language Models (LLM)
Machine Learning
Neural Machine Translation (NMT)
Data Engineering

Data Science & Analytics Intern

AGCO Corporation · Jan 2024 - Present

  • Provided technical solutions to streamline operations, addressing challenges in cloud permissions management, competitive data acquisition, and the processing of warranty datasets too large to handle in Excel.
  • Developed an AWS Permissions Automation system, transforming an 8-hour daily manual task into a 1-second automated process by creating an optimized script leveraging AWS API integrations, significantly enhancing operational productivity and cloud security management.
  • Engineered an Automated Competitive Data Acquisition platform, building a scalable, distributed web-crawling system that reduced data acquisition time from months to approximately 2 hours and 30 minutes. Overcame obstacles like IP blocks, geolocation restrictions, rate limiting, and device tracking to provide AGCO with real-time insights for data-driven decision-making.
  • Created a Warranty Data Processing Tool, designing and implementing a GUI-based internal application to compare millions of warranty serial numbers and equipment IDs. This innovation turned a two-week manual process into a 1-minute, 30-second automated solution, optimizing workflow efficiency and eliminating delays.
  • Enhanced CRM functionalities by integrating Salesforce customizations for a Yanzhou, China dealer warranty project, improving market understanding through advanced CRM insights and aiding strategic improvements in dealer support and customer relationships.
  • Optimized ETL pipelines in AWS/Azure, deploying, automating, and optimizing data pipelines with outputs stored in S3 buckets and visualized via a VistaMap frontend, providing comprehensive and up-to-date data visualizations.
  • Contributed to operational cost savings and efficiency gains, enabling faster, data-driven decisions and improving market coverage strategies through frequent data updates at no additional cost.
Web Scraping
Amazon Web Services (AWS)
Automation
Data Engineering
Python (Programming Language)
Artificial Intelligence (AI)
Large Language Models (LLM)
Microsoft Azure
Java
Extract, Transform, Load (ETL)

Data Scientist

Scale AI · Jan 2024 - Jun 2024

  • Contributed to the Google PaLM 2 (Bard) → Gemini transition team, aiming to dramatically improve the Gemini model's performance on key benchmarks such as MMLU and HumanEval, which assess language understanding and coding abilities.
  • Enhanced Gemini Ultra to outperform PaLM 2, focusing on significantly improving multilingual and coding task capabilities as measured by industry benchmarks.
  • Led the implementation of divergence-based RLHF (Reinforcement Learning from Human Feedback) to refine model alignment for both language and coding tasks.
  • Collaborated on fine-tuning Gemini Ultra, optimizing performance in real-world tasks including multilingual translation and complex coding challenges.
  • Conducted extensive benchmarking and optimization, specifically targeting improvements on MMLU and HumanEval.
  • Elevated MMLU performance from 78% (PaLM 2) to 90.04% with Gemini Ultra, showcasing significant advancements in language understanding.
  • Boosted HumanEval benchmark from 37.6% to 74.4%, positioning Gemini Ultra as a leading model in coding and language tasks.
Python (Programming Language)
Artificial Intelligence (AI)
Neural Machine Translation (NMT)
Generative AI
Large Language Models (LLM)
Data Science
Prompt Engineering

Software Engineer

Scale AI · Jan 2023 - Dec 2023

  • Contributed to the OpenAI ChatGPT team, focusing on improving infrastructure, data integrity, and preprocessing pipelines to ensure a smooth transition from GPT-3 to GPT-4.
  • Led the development and optimization of large-scale data preprocessing pipelines, achieving a 35% increase in data processing efficiency and accelerating GPT-4's training, making it the most advanced language model at the time.
  • Developed validation and sanitization protocols to enhance data quality, reducing model hallucinations and improving overall reliability.
  • Assisted in the systematic transition of infrastructure from GPT-3 to GPT-4, leveraging improved Azure-based AI infrastructure to handle complex, real-world tasks like reasoning and coding.
  • Integrated feedback mechanisms from Reinforcement Learning from Human Feedback (RLHF) to fine-tune GPT-4's responses, reducing harmful outputs by 82% compared to GPT-3.5 and improving factual accuracy by 40%.
  • Enhanced model safety and alignment features, making GPT-4 82% less likely to produce disallowed content and improving performance on key benchmarks such as TruthfulQA and adversarial factuality evaluations.
  • Supported GPT-4's deployment across applications, including ChatGPT Plus, API integrations, and partnerships with Duolingo and Be My Eyes, delivering safer and more useful responses to users.
Natural Language Processing (NLP)
Front-End Development
Python (Programming Language)
Artificial Intelligence (AI)
Neural Machine Translation (NMT)
Generative AI
Programming
Back-End Web Development
Large Language Models (LLM)
DevOps
Prompt Engineering

Research Assistant, Machine Learning & Brain Computer Interfaces

University of Illinois, Beckman Institute for Advanced Science and Technology, Cognitive Neuroimaging Lab · Jan 2022 - Dec 2022 · 1 yr

  • Developed deep learning architectures for multi-modal neuroimaging fusion (MRI/EEG/fNIRS) to support real-time Brain-Computer Interface (BCI) applications, enhancing neural decoding accuracy for user control.
  • Optimized neural decoding algorithms to enable real-time BCI feedback, improving the responsiveness and reliability of user interactions.
  • Designed and implemented parallelized preprocessing pipelines for high-dimensional neurophysiological data, increasing efficiency and throughput while reducing computational overhead.
  • Integrated cross-platform BCI signal processing workflows, ensuring seamless interoperability between data collection platforms and visualization systems.
  • Implemented advanced spatiotemporal data visualization techniques, facilitating clear representation of complex neural dynamics in closed-loop neurofeedback systems.
  • Significantly improved neural decoding accuracy in real-time BCI applications, leading to more reliable and responsive user control.
  • Enhanced data processing efficiency through parallelized workflows, allowing for faster real-time analysis and reducing computational delays.
Brain-computer Interfaces
Python (Programming Language)
R (Programming Language)
Machine Learning
Data Visualization
Quantitative Research
Data Analysis
Optical Imaging
EEG
Programming
Neuroimaging
Data Engineering
Pattern Recognition
MRI
Data Science
Computer Science

Education


University of Illinois Urbana-Champaign

Bachelor of Science - BS, Brain and Cognitive Science • 2025

Focusing on:

  • Machine Learning Theory
  • Data Structures & Algorithms
  • Artificial Neural Networks
  • Generative Artificial Intelligence
    • Mixture-of-experts
    • Quantization
    • End-to-end multimodal speech + vision large language models

Projects

Speakease AI

Jan 2023 - Present

A free, AI-driven platform providing real-time voice translation across 50+ languages.

  • Led all coding and machine learning development, creating a custom multimodal language model utilizing Reinforcement Learning from Human Feedback (RLHF) and divergence training, resulting in high-quality, context-aware translations.
  • Architected and maintained cloud-native infrastructure using TypeScript, Next.js, Supabase, and Azure; deployed models on Azure OpenAI service and secured cloud funding through a Microsoft partnership.
  • Implemented low-latency, real-time interactions by leveraging Azure's infrastructure to ensure high availability and scalability for seamless user experiences.
  • Managed end-to-end engineering, data science, and data engineering efforts, from model training and fine-tuning to infrastructure optimization, enabling translations adaptive to sociocultural and emotional contexts.
  • Reached more than 1,000 active users, providing accessible, real-time translation support to immigrants, ESL learners, travelers, and language learners.
  • Pioneered innovations in AI translation using advanced machine learning techniques, creating a scalable, cloud-native solution that democratizes access to high-quality language tools.
Generative AI
Natural Language Processing (NLP)
Large Language Models (LLM)
Machine Learning
Neural Machine Translation (NMT)
Data Engineering

Ericflix: A Self-Hosted Media Streaming Platform with Advanced Features (25+ MAU)

Aug 2021 - Present

Self-hosted media streaming infrastructure with advanced features:

  • Content-based recommender systems using unsupervised clustering and NLP
  • Metadata-rich UI for enhanced discoverability
  • Automated content ingestion pipelines
  • User-driven content acquisition portal
  • Adaptive bitrate streaming with client-side network analysis
  • High-throughput GPU-accelerated transcoding
  • Premium audio-visual format support (Dolby Atmos, Dolby Vision, HDR10+)
  • IPTV integration
  • Automated polyglot subtitle and audio track retrieval

25+ Monthly Active Users, exceeding commercial streaming quality benchmarks

Data Engineering
Machine Learning
SQL
Back-End Web Development
Scripting
Automation
Front-End Development
Database Administration
DevOps
Natural Language Processing (NLP)

Advanced Autoencoder Architectures for Unsupervised Feature Learning and Dimensionality Reduction

Oct 2023 - Dec 2023

Development of unsupervised neural network models focusing on autoencoder architectures employing backpropagation algorithms for efficient data encoding and reconstruction tasks.

  • Implemented both single-layer and two-layer autoencoders in Python, optimizing algorithmic efficiency and scalability within high-dimensional data contexts
  • Conducted in-depth exploration of non-linear dimensionality reduction techniques and feature extraction methodologies, utilizing sigmoid activation functions and gradient descent optimization
  • Engaged in rigorous hyperparameter tuning to enhance model convergence and minimize reconstruction loss, adhering to computational standards in unsupervised learning paradigms
  • Leveraged Python's scientific computing libraries including NumPy for high-performance numerical operations, Scikit-learn for machine learning utilities, and Matplotlib for data visualization
  • Contributed to advanced neural network modeling within the context of cognitive psychology and neuroscience, aligning with the academic objectives of PSYC 489: Neural Network Modeling Lab
  • Underscored the applications of autoencoders in dimensionality reduction, anomaly detection, and feature learning, enriching the discourse in unsupervised machine learning and data representation
Artificial Neural Networks
Data Engineering
Machine Learning
Unsupervised Learning
Backpropagation
Dimensionality Reduction
Computational Neuroscience
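
The single-layer autoencoder described above (sigmoid activations, gradient descent, reconstruction loss) can be illustrated with a minimal NumPy sketch. This is not the project's actual code: the function names, layer sizes, learning rate, and the one-hot toy data are all assumptions chosen for illustration.

```python
import numpy as np


def sigmoid(z):
    """Logistic activation used in both the encoder and the decoder."""
    return 1.0 / (1.0 + np.exp(-z))


def train_autoencoder(X, n_hidden, lr=0.5, epochs=2000, seed=0):
    """Train a one-hidden-layer autoencoder with full-batch gradient descent.

    Hypothetical helper for illustration; returns the learned parameters
    and the mean squared reconstruction loss recorded at each epoch.
    """
    rng = np.random.default_rng(seed)
    n_samples, n_features = X.shape
    W1 = rng.normal(0.0, 0.1, (n_features, n_hidden))  # encoder weights
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(0.0, 0.1, (n_hidden, n_features))  # decoder weights
    b2 = np.zeros(n_features)
    losses = []
    for _ in range(epochs):
        # Forward pass: encode, then reconstruct.
        H = sigmoid(X @ W1 + b1)
        Y = sigmoid(H @ W2 + b2)
        err = Y - X
        losses.append(float(np.mean(err ** 2)))
        # Backpropagation of the squared error through both sigmoids
        # (constant factors are absorbed into the learning rate).
        dY = err * Y * (1.0 - Y) / n_samples
        dH = (dY @ W2.T) * H * (1.0 - H)
        W2 -= lr * (H.T @ dY)
        b2 -= lr * dY.sum(axis=0)
        W1 -= lr * (X.T @ dH)
        b1 -= lr * dH.sum(axis=0)
    return (W1, b1, W2, b2), losses


def encode(X, params):
    """Project inputs onto the lower-dimensional hidden code."""
    W1, b1, _, _ = params
    return sigmoid(X @ W1 + b1)


# Toy data (an assumption): eight one-hot patterns compressed to 3 hidden units.
X = np.eye(8)
params, losses = train_autoencoder(X, n_hidden=3)
codes = encode(X, params)
```

Plotting `losses` against the epoch index (for example with Matplotlib) gives the convergence curve that hyperparameter tuning would aim to improve; a two-layer variant would add a second hidden layer on each side of the code.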

Hopfield Networks for Associative Memory Modeling

Aug 2023 - Oct 2023

Construction and analysis of an advanced Hopfield Network framework to simulate associative memory and pattern retrieval in neural systems, employing recurrent neural network architectures with symmetrically weighted connections and energy minimization principles to achieve stable memory states.

  • Focused on accurate simulation of network dynamics based on the original formulations by John Hopfield, exploring the network's capacity for content-addressable memory and its applications in optimization problems
  • Implemented a class-based structure in Python for modularity and extensibility, incorporating methods for network training on multiple memory patterns, state updates with synchronous and asynchronous options, and precise energy calculations to analyze convergence behaviors
  • Leveraged NumPy for efficient numerical computations and Matplotlib for visualizing energy landscapes and network states
  • Enhanced understanding of cognitive functions and contributed to computational neuroscience studies within the context of PSYC 489: Neural Network Modeling Lab
  • Offered insights into memory models, pattern recognition, and the underlying mechanics of neural associative processes
Data Engineering
Machine Learning
Artificial Neural Networks
Recurrent Neural Networks (RNN)
Pattern Recognition
Python (Programming Language)
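
A class-based Hopfield network of the kind described above, with Hebbian training on several patterns, synchronous and asynchronous updates, and an energy function, might look like the following sketch. The class name, method names, and the two example patterns are assumptions for illustration, not the project's actual implementation.

```python
import numpy as np


class HopfieldNetwork:
    """Minimal Hopfield network over bipolar (+1 / -1) units."""

    def __init__(self, n_units):
        self.n_units = n_units
        self.weights = np.zeros((n_units, n_units))

    def train(self, patterns):
        """Hebbian rule: sum of outer products, symmetric, zero diagonal."""
        patterns = np.asarray(patterns, dtype=float)
        self.weights = patterns.T @ patterns / self.n_units
        np.fill_diagonal(self.weights, 0.0)

    def energy(self, state):
        """Energy E = -1/2 * s^T W s; stored patterns sit in local minima."""
        state = np.asarray(state, dtype=float)
        return float(-0.5 * state @ self.weights @ state)

    def update(self, state, synchronous=False, rng=None):
        """Apply one full update sweep and return the new state."""
        state = np.array(state, dtype=float)
        if synchronous:
            # All units are updated at once from the same previous state.
            return np.where(self.weights @ state >= 0, 1.0, -1.0)
        if rng is None:
            rng = np.random.default_rng()
        # Asynchronous: units are updated one at a time in random order.
        for i in rng.permutation(self.n_units):
            state[i] = 1.0 if self.weights[i] @ state >= 0 else -1.0
        return state

    def recall(self, state, max_sweeps=20, rng=None):
        """Repeat asynchronous sweeps until the state stops changing."""
        state = np.array(state, dtype=float)
        for _ in range(max_sweeps):
            new_state = self.update(state, synchronous=False, rng=rng)
            if np.array_equal(new_state, state):
                break
            state = new_state
        return state


# Two orthogonal example patterns (assumed for this illustration).
p1 = np.array([1, 1, 1, 1, -1, -1, -1, -1], dtype=float)
p2 = np.array([1, -1, 1, -1, 1, -1, 1, -1], dtype=float)
net = HopfieldNetwork(n_units=8)
net.train([p1, p2])

noisy = p1.copy()
noisy[0] = -noisy[0]  # corrupt one unit
recalled = net.recall(noisy, rng=np.random.default_rng(0))
```

Comparing `net.energy(noisy)` with `net.energy(recalled)` shows the energy decreasing as the network settles into the stored pattern, which is the convergence behavior the energy calculations are meant to analyze.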

Sentiment Analysis on the IMDB Review Dataset Using Optimized Recurrent Neural Networks

Aug 2022 - Dec 2022

Sentiment classification on large-scale movie review corpus using optimized recurrent neural architectures.

  • Employed sequential memory modeling for temporal dependency capture in natural language
  • Performed hyperparameter tuning via grid and stochastic search methodologies for performance maximization
  • Leveraged distributed representations and transfer learning from pre-trained word embeddings
  • Implemented within TensorFlow/Keras ecosystem with auxiliary data manipulation via Pandas and Scikit-learn
  • Contributed to affective computing research in human-computer interaction and opinion mining domains
Data Engineering
Machine Learning
NLP Libraries
NLTK
Python (Programming Language)
Programming
Computer Science
Artificial Neural Networks
Data Science
Recurrent Neural Networks (RNN)

A Comprehensive Analytical Framework for Multiplayer Yahtzee: From Simulation to Statistical Insights

Jan 2022 - May 2022

Stochastic simulation framework for multi-agent decision-making in constrained combinatorial environments.

  • Leveraged Monte Carlo methods for strategy optimization in dice-based games
  • Conducted statistical inference on high-dimensional gameplay metrics via hypothesis testing and multivariate regression
  • Performed data-driven insights extraction through exploratory data analysis and advanced visualization techniques
  • Implemented in Python with NumPy/Pandas for numerical computing and Matplotlib/Seaborn for graphical representation
  • Contributed to computational game theory and strategic decision analysis in uncertain, rule-bound domains
Data Engineering
Data Analysis
Python (Programming Language)
Regression Analysis
Programming
Computer Science
Statistics
Data Visualization
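
As a simplified illustration of the Monte Carlo approach mentioned above, the sketch below estimates the chance of rolling a Yahtzee (five of a kind) within a single three-roll turn. It is a single-player toy example, not the project's multiplayer framework; the function names and the "keep the most common face" strategy are assumptions.

```python
import random
from collections import Counter


def play_turn(rng, n_dice=5, n_rolls=3):
    """Simulate one turn; return True if it ends with all dice equal."""
    dice = [rng.randint(1, 6) for _ in range(n_dice)]
    for _ in range(n_rolls - 1):
        face, count = Counter(dice).most_common(1)[0]
        if count == n_dice:
            break
        # Keep the most common face and reroll every other die.
        dice = [face] * count + [rng.randint(1, 6) for _ in range(n_dice - count)]
    return len(set(dice)) == 1


def estimate_yahtzee_probability(n_trials=100_000, seed=0):
    """Monte Carlo estimate: fraction of simulated turns ending in a Yahtzee."""
    rng = random.Random(seed)
    hits = sum(play_turn(rng) for _ in range(n_trials))
    return hits / n_trials


estimate = estimate_yahtzee_probability(n_trials=50_000)
```

The estimate should land near the known value of roughly 4.6% for this strategy, with sampling error shrinking as `n_trials` grows. The same pattern (simulate many games, aggregate an outcome metric) extends to comparing scoring strategies across multiple players.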

Skills & Expertise

Natural Language Processing (NLP)
DevOps
Neural Machine Translation (NMT)
Artificial Neural Networks
Generative AI
Python
TypeScript
Next.js
Azure
AWS
Machine Learning
Data Science
TensorFlow
Keras
SQL
Data Engineering
Unsupervised Learning
Recurrent Neural Networks (RNN)
Data Analysis
Regression Analysis
Programming
Statistics
Data Visualization
Computer Science
Brain-computer Interfaces
R (Programming Language)
Quantitative Research
Optical Imaging
EEG
Neuroimaging
Pattern Recognition
MRI

Get in Touch

Speakease AI

© 2024 Eric Soderquist. All rights reserved.