I am a government professional with over a decade of experience in data product consulting, leadership, project management, editing/writing and patent examination. I graduated in May 2018 with a Master of Science in Data Science from Indiana University. This blog covers a wide range of data topics, presented primarily with a non-technical audience in mind.

Laura H. Kahn

laurakahn2@gmail.com | @LauraHKahn | GitHub


Data Scientist with 3 years of experience using data mining from heterogeneous sources, predictive modeling and machine learning algorithms to solve challenging problems. Public servant with 11 years of experience at America’s Innovation Agency, including roles in data product management, prescriptive analytics, supervisory leadership and technical communication.


Indiana University, M.S. Data Science, January 2016 – May 2018
North Carolina State University, B.S. Textile Engineering and B.A. Spanish, August 1999 – May 2004

Universidad de Santander – Study Abroad program for Spanish Language, September – December 2003

Data Science Projects

Georeferenced Tweet Location Prediction with NLP (supervised classification)

  • Linked keywords in 1.32 million Tweets within a location of 10 km using logistic regression and kNN, with an accuracy improvement of up to 40% and a log loss of 1.53 over baseline
  • Paper presented at the 2018 SMAP Conference

Use of Artificial Neural Networks for Predicting Poverty (supervised classification)

  • Dimensionality reduction (Pearson’s correlation) from 300 to 5 features to predict poverty
  • Neural network models had F1 scores of 0.85 (World Bank competition) and 0.99 (Kaggle competition)

Algorithmic Trading of Coffee Futures with Machine Learning (regression)

  • Predicted daily closing prices of coffee futures from time-series data with autoregressive models
  • Models had a maximum percent prediction error of 0.00328

Morning Joe: Quantifying the Relationship Between Coffee Rust, Production and Futures

  • Provided the first known quantitative framework for describing the relationship between weather, coffee rust, production amount and futures prices in Brazil, Colombia and New Caledonia

Prediction of Brazilian Coffee Rust using Machine Learning (unsupervised classification)

  • Applied neural networks to predict coffee rust amount at the country level with an MSE of 56.86


  • Big Data Acquisition: Python BeautifulSoup and PySpark (distributed computing) libraries
  • Data Cleaning: Python pandas library
  • Data Modeling: Python PyModels library
  • Data Mining: Python SciPy, NumPy, scikit-learn, SQLAlchemy libraries
  • Jupyter integrated development framework
  • Natural Language Processing (English and Spanish): Python NLTK library
  • Machine Learning: Feature Engineering and Deep Learning (Python Keras, Apache Spark ML library)
  • Data Visualization: Tableau, Python Matplotlib, Plotly and Bokeh libraries
  • Univariate Statistical Analysis: Python Statsmodels library
  • Network Theory: Neo4j, Social Network Theory
  • Geospatial Analysis: QGIS
  • Product Management, Communication, Leadership, Time Management, Problem Solving, Customer Service

Government Experience

United States Patent and Trademark Office, Information Management Specialist (October 2010 – present)

  • Reported descriptive analytics from two bulk data systems to a wide range of audiences
  • Managed customer access to data products by collaborating with systems operations staff
  • Translated customer needs into technical requirements for Agency’s first API project in an Agile environment
  • Earned Bronze Medal Award (2017) for five consecutive years of outstanding performance

United States Patent and Trademark Office, Supervisory Library Technician (June 2007 – October 2010)

  • Led a team of library assistants in customer service tasks
  • Assisted members of the public with formulating intellectual property searches

United States Patent and Trademark Office, Patent Examiner (September 2004 – June 2007)

  • Conducted technology research for 150+ patent applications and communicated legal findings to customers
  • Received highest performance rating based on work quality and timeliness metrics


Kahn, Laura H. “Spatiotemporal Twitter analysis of the Venezuelan food crisis.” Journal of Food Processing Technology, 8:5. (2017): 51. Proceedings from the 2nd International Conference on Food Security and Sustainability. https://www.omicsonline.org/conference-proceedings/2157-7110-C1-062-011.pdf

Kahn, Laura H. “Chévere! Text-Based Twitter Patterns from Venezuelan Food Shortages.” 2018 13th International Workshop on Semantic and Social Media Adaptation and Personalization (SMAP), Zaragoza, 2018, pp. XX-XX. https://ieeexplore.ieee.org/document/8501870

Kahn, Laura H. “Quantitative Framework for Coffee Leaf Rust (Hemileia vastatrix), Production and Futures.” Manuscript submitted to the International Journal of Agriculture Extension. 29 November 2018.
