About
Broadly, I’m an interdisciplinary data scientist with special interests in psycholinguistics, computational linguistics, statistics, and Japanese. More specifically, I apply statistical and natural language processing techniques to solve problems and to deepen my own and my team’s understanding of the data at hand. I bring a strong statistical background to my work, whether in data handling, ad hoc statistical analyses, or machine learning applications.
Skills / Stack
| Statistics | python, R, julia, hypothesis testing, bayesian modeling, GLM/SEM/MLM, MCMC methods | [●●●●●●●●○○] |
| Data Analysis | python, R, pandas, numpy, tidyverse | [●●●●●●●●○○] |
| Machine Learning | scikit-learn, tensorflow, (un)supervised learning, clustering, generative models, RAG, prompt engineering, NLP, embedding methods | [●●●●●●●○○○] |
| Data Reporting | static & interactive reports, dashboards, data visualization, matplotlib, plotly, LookerStudio | [●●●●●●○○○○] |
| Data Management | data cleaning, data collection, (No)SQL, document processing, BigQuery | [●●●●●○○○○○] |
In addition to the above data science skills, I’ve also got:
| Research | research/experimental design, scientific writing, technical reporting | [●●●●●●●●○○] |
| Conversational Design | Google CCAI/Dialogflow, NLU, chatbot integration | [●●●●●●○○○○] |
| Misc Programming | go, ruby, linux, git & its myriad hosting services | [●●●●●○○○○○] |
| Japanese | immersion, teaching, professional communication | [●●●●●○○○○○] |
| Cloud Platforms | Google Cloud Platform, Azure | [●●●○○○○○○○] |
Interests
| Speech Processing | speech intelligibility, phonology, emotional word characteristics |
| Bilingualism | language acquisition, L2 phoneme processing |
| Natural Language Processing | word & sentence embedding, unsupervised learning, document classification |
| Statistical Methods | Bayesian modeling, MCMC simulation, clustering |