My recent work in AI, ML, and data science

Projects

Cloud2Cloud Harvard / NASA Capstone Project - Ongoing

  • Cloud2Cloud focuses on accurately measuring cloud-top heights to enhance the calibration and validation of satellite radiometric instruments. It is a joint project with Harvard Extension School and NASA.
  • NASA developed the Fly’s Eye GLM Simulator (FEGS), a multi-spectral radiometer system with 30 radiometers and an HD camera, to validate the Geostationary Lightning Mapper (GLM) on the GOES-16 satellite. FEGS is mounted on the NASA ER-2 airborne laboratory, an aircraft that flies at 70,000 feet. During a 2017 flight campaign, the ER-2 collected data using FEGS and the Cloud Physics LiDAR (CPL) to measure cloud heights.
  • While LiDAR provides precise cloud-top heights, it offers only single-point values. Cloud2Cloud aims to develop a predictive computer vision model that combines high-definition images from FEGS with LiDAR data to estimate cloud-top heights accurately and create a three-dimensional height field.
  • The project proposal is located here.

NLP

Vision for safety inspections

Vision

Visualization

  • A JavaScript D3 visualization project on fake news, focused mostly on COVID-19 propaganda (requires Chrome or Firefox on desktop). It was selected as the best project for the class in my master's program.
  • I recorded the 2-minute video for the project in an old-timey mid-Atlantic accent for, uh, fun.

Clarifai Blogs

  • I've written about 60 blog posts for Clarifai. They can all be found here. Below are a few samples. I also maintained the Clarifai documentation for two years, so much of the newer content on their docs site was written by me using Meta's "Docusaurus" platform.
  • Blog post on AI bias
  • Creating AI workflows post
  • Clarifai Quick Start post

Clarifai Videos

Promotional Clarifai Videos

  • Using Adobe After Effects, I've created slick promotional loops shown at trade shows.
    • Digital Asset Management promotional video
    • Retail AI promotional video

Virmuze

  • Virmuze is a startup of mine that I worked on for a while. The National Security Agency (NSA) uses it to host the National Cryptologic Museum's online exhibits. It's an unusual point of pride for me as I also helped them create much of the online exhibit content during the COVID-19 pandemic.
  • Link to Virmuze on nsa.gov (it's the colorful footprint icon next to the Twitter logo)
  • Link to the museum itself on Virmuze

Database design

  • For a research class in big data systems, I developed a systems project in C++.
  • It's a fully functional, modern, write-optimized NoSQL key-value store built on an LSM-tree (log-structured merge-tree). It supports tiered, leveled, lazy-leveled, and partial (by percentage) compaction policies. It also offers Monkey (Monkey: Optimal Navigable Key-Value Store) Bloom filter optimization, and uses a thread pool for internally multi-threaded range queries and compaction. It is externally multi-threaded as well: multiple clients can access the database concurrently, with per-level blocking.
  • The final report is located here.
  • A literature review on LSM-tree key-value stores is located here.
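
To give a flavor of the ideas above, here is a minimal Python sketch of an LSM-tree with a memtable, sorted runs, a Bloom filter per run, and tiered compaction. It is an illustration only, not the project's C++ code: the names `BloomFilter`, `SortedRun`, and `TinyLSM` and all parameters are hypothetical, every run gets the same fixed-size filter (Monkey instead tunes filter memory per level), and deletes, persistence, and threading are left out.

```python
import hashlib
from bisect import bisect_left


class BloomFilter:
    """Fixed-size Bloom filter; may report false positives, never false negatives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for simplicity

    def _positions(self, key):
        # Derive several bit positions by salting one hash function.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key):
        return all(self.bits[pos] for pos in self._positions(key))


class SortedRun:
    """Immutable sorted run of key-value pairs, guarded by a Bloom filter."""

    def __init__(self, items):
        self.keys = sorted(items)
        self.values = [items[k] for k in self.keys]
        self.bloom = BloomFilter()
        for k in self.keys:
            self.bloom.add(k)

    def get(self, key):
        if not self.bloom.might_contain(key):
            return None  # definitely absent: skip the search
        i = bisect_left(self.keys, key)
        if i < len(self.keys) and self.keys[i] == key:
            return self.values[i]
        return None


class TinyLSM:
    """Toy LSM-tree with tiered compaction (no deletes, no persistence)."""

    def __init__(self, memtable_limit=4, runs_per_level=2):
        self.memtable = {}
        self.memtable_limit = memtable_limit
        self.runs_per_level = runs_per_level
        self.levels = [[]]  # levels[i] is a list of runs, newest last

    def put(self, key, value):
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.levels[0].append(SortedRun(self.memtable))
            self.memtable = {}
            self._compact(0)

    def _compact(self, level):
        # Tiering: once a level is full, merge all its runs into the next level.
        if len(self.levels[level]) < self.runs_per_level:
            return
        merged = {}
        for run in self.levels[level]:  # oldest to newest, so newer values win
            merged.update(zip(run.keys, run.values))
        self.levels[level] = []
        if level + 1 == len(self.levels):
            self.levels.append([])
        self.levels[level + 1].append(SortedRun(merged))
        self._compact(level + 1)

    def get(self, key):
        if key in self.memtable:
            return self.memtable[key]
        for level in self.levels:  # shallower levels hold newer data
            for run in reversed(level):
                value = run.get(key)
                if value is not None:
                    return value
        return None


db = TinyLSM(memtable_limit=4, runs_per_level=2)
for i in range(10):
    db.put(f"k{i}", i)
db.put("k1", 100)  # a newer write shadows the older value
print(db.get("k1"), db.get("k7"), db.get("missing"))
```

Writes only ever touch the in-memory table and append new runs, which is what makes the structure write-optimized; the Bloom filters let a lookup skip most runs that cannot contain the key.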

Teaching

  • Teaching fellow for Fall 2023, CSCI E-89C Deep Reinforcement Learning.
  • I teach a weekly section on foundational and advanced concepts in reinforcement learning and deep learning. I also grade assignments and answer questions via the class forum and email.
  • Reinforcement learning topics include Markov decision processes (MDPs), dynamic programming with the Bellman equation, Monte Carlo methods, temporal-difference prediction and control (including SARSA and Q-learning), n-step TD, and approximation methods such as stochastic-gradient and semi-gradient TD updates and least-squares TD.
  • Deep learning topics include training neural networks with backpropagation, strategies for tuning neural networks with a focus on regularization, convolutional neural networks (CNNs), and recurrent neural networks (RNNs).
  • Deep reinforcement learning topics include value-based deep RL using Q-networks, policy-based approaches with REINFORCE, and asynchronous methods for deep RL, with a spotlight on advantage actor-critic (A2C).
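
As a small illustration of the temporal-difference control topics listed above, the sketch below contrasts the tabular SARSA and Q-learning updates. It is not course material: the function names, the state and action labels, and the step-size and discount values are all made up for the example.

```python
from collections import defaultdict


def q_learning_update(q, state, action, reward, next_state, actions,
                      alpha=0.1, gamma=0.99, done=False):
    """Off-policy TD control: bootstrap from the best action in next_state."""
    best_next = 0.0 if done else max(q[(next_state, a)] for a in actions)
    td_target = reward + gamma * best_next
    q[(state, action)] += alpha * (td_target - q[(state, action)])


def sarsa_update(q, state, action, reward, next_state, next_action,
                 alpha=0.1, gamma=0.99, done=False):
    """On-policy TD control: bootstrap from the action actually taken next."""
    next_value = 0.0 if done else q[(next_state, next_action)]
    td_target = reward + gamma * next_value
    q[(state, action)] += alpha * (td_target - q[(state, action)])


actions = ["left", "right"]

# Same transition, same starting table: only the bootstrap term differs.
q_ql = defaultdict(float)
q_ql[("s1", "right")] = 1.0
q_learning_update(q_ql, "s0", "right", 0.0, "s1", actions, alpha=0.5, gamma=0.9)

q_sarsa = defaultdict(float)
q_sarsa[("s1", "right")] = 1.0
sarsa_update(q_sarsa, "s0", "right", 0.0, "s1", "left", alpha=0.5, gamma=0.9)

print(q_ql[("s0", "right")], q_sarsa[("s0", "right")])
```

Q-learning moves its estimate toward the greedy value of the next state regardless of what the agent does next, while SARSA uses the value of the action the behavior policy actually chose (here the exploratory "left").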

Retrieval Augmented Generation (RAG)

  • I built a custom RAG system with an LLM that scrapes and answers questions on entire websites using LlamaIndex, Weaviate, LangChain, and GPT-3.5. It's hosted on Google Cloud Services and Google Cloud Storage, and uses Docker and Kubernetes in production. The project also hosts a fine-tuned BERT model on Google Vertex AI to classify the generated text, and the whole thing runs FastAPI on the backend and React on the frontend.
  • Video RAG Detective: Retrieval Augmented Generation with website data
  • Medium post
  • GitHub repo
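
For readers new to RAG, the retrieve-then-prompt idea can be sketched in a few lines of plain Python. This is a toy under loud assumptions: it uses word-count vectors and cosine similarity in place of the learned embeddings and Weaviate vector store in the real project, it stops before the LLM call, and the helper names (`embed`, `retrieve`, `build_prompt`) and sample chunks are invented for the example.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy 'embedding': a bag-of-words count vector (stand-in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def retrieve(question, chunks, k=2):
    """Return the k chunks most similar to the question."""
    q_vec = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q_vec, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(question, chunks, k=2):
    """Assemble the text that would be sent to the LLM."""
    context = "\n".join(f"- {c}" for c in retrieve(question, chunks, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


# Hypothetical scraped website chunks.
chunks = [
    "Our return policy allows refunds within 30 days.",
    "Shipping takes five business days.",
    "Support is open on weekdays.",
]
print(build_prompt("What is the return policy?", chunks, k=1))
```

Grounding the prompt in retrieved chunks is what lets the model answer questions about a site it was never trained on.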

Harvard Extension Masters

  • I'm a degree candidate for an ALM in Data Science and have finished the 11 courses for my master's, with only the final capstone project remaining, to be completed in December 2024. I have maintained a 4.0 GPA in the following 11 classes:

    • Data Modeling (R)
    • Foundations of Data Science and Engineering (Python, SQL, Tableau)
    • Deep Learning for NLP (Research, Python, PyTorch)
    • Computer Vision (Python, Keras/TensorFlow)
    • Deep Reinforcement Learning (Python)
    • Elements of Data Science and Statistical Learning with R (R)
    • Time Series Analysis with Python (Python)
    • Visualization (D3 JavaScript, HTML, CSS, Tableau)
    • Big Data Systems (Research, C++)
    • Productionizing AI (MLOps): AC215
    • Pre-capstone proposal (Cloud2Cloud)

Contact

Feel free to reach out to me: