Mishig Davaadorj

dmishig@gmail.com

I’m a software & ML engineer. I received my bachelor’s degree in computer science from Colorado College in 2019. Since the summer of 2021, I’ve been part of a team building the Hugging Face Hub and contributing to Hugging Face’s open-source Machine Learning libraries.

Here are some of the highlights:

Built various aspects of the Hugging Face Hub, helping make it the de facto platform for sharing models and datasets. Used TypeScript, Svelte, Tailwind CSS, MongoDB, and Express. The features I’ve worked on are:

diffusers#559: JAX/Flax implementation of Stable Diffusion 1.1.
tokenizers#890: improve serialization/deserialization of tokenizers through Rust macros that implement necessary serde traits.
huggingface/chat-ui: UI for chatting with LLMs (built with Svelte, Tailwind CSS, MongoDB, and TypeScript). Supports tool calling (image generation) & RAG (web search, document extraction).
transformers#13828: image segmentation pipeline implementation for facebook/detr-resnet-50 & other models that can do image segmentation.
tokenizers#976: parallelize unigram tokenization trainer using rayon, a popular Rust parallelization crate.
lerobot#277: robotics dataset (videos & sensor signals) visualizer for easily testing & debugging real-life robotics.
huggingface/doc-builder: Python package that parses Markdown files, Jupyter notebooks, and Python docstrings (through inspect) and builds docs websites (using Svelte), powering hf.co/docs.
huggingface/gguf.js: JavaScript GGUF parser that works on remotely hosted files. GGUF is a model-weights file format created by Georgi Gerganov (the creator of llama.cpp).
diffuse-the-rest: a web app that uses Stable Diffusion (diffusers#559 backend) to turn sketches into higher-quality images. One of the first apps to go viral that used Stable Diffusion (Aug 2022).