(#) Mateo Restrepo
_ML Engineer / Quantitative Developer_ • [mr324@cornell.edu](mailto:mr324@cornell.edu) • [LinkedIn Profile](https://linkedin.com/in/mateorestrepo) • [GitHub](https://github.com/cuckookernel)
(##) Summary
PhD-level Machine Learning Engineer with 8+ years of experience building predictive models.
Proven track record in commercial banking, proptech, and high-frequency trading.
Expert in combining advanced mathematics, data engineering, and programming to drive impactful business solutions.
(#) 👷 Professional experience
(##) [La Haus](https://www.lahaus.com) (2019/4 – 2024/1)
_Principal ML Engineer_
* Developed a RAG question-answering bot to automate first contact and profiling of new homebuyer leads.
* Using OpenAI's LLM APIs, developed the back end of a WhatsApp chatbot that won an internal hackathon. The bot profiled new homebuyer leads and generated a personalized sales pitch for a selected residential development.
* Did 100% of the design and 70% of the implementation, using _Feast_, of an internal feature store covering residential listings and homebuyers. The implementation included batch and online data sources, feature views, etc.
* Developed personalized property recommendation algorithms for homebuyers, as well as property similarity algorithms based on deep-learning embeddings.
* Implemented DL algorithms for image quality scoring, image classification, and automated captioning.
(##) GRUPO BANCOLOMBIA S.A.
_Analytic Capabilities Manager_, Medellín, Colombia (2014/8 – 2018/4)
* Developed an item-based recommendation engine to suggest Point-of-Sale businesses to debit- and credit-card holders.
* Created an income-estimation model (via regression RFs) for individuals as one of the cornerstones for remaking the bank’s credit preapproval system.
* Contributed to all phases of building the new credit scoring system: ETL, clean-up, ML model architecture definition, training, hyperparameter tuning, evaluation, automation, and deployment to production, all using open-source tools.
* Contributed to the creation of classification models to predict customer churn.
* Built an NLP-based automatic CV scorer (using NMF) to detect strong data analyst/scientist profiles.
* Trained bank coworkers in SQL, Python, and Spark for data analysis and ML.
(##) BENCHMARK SOLUTIONS
_Assistant Vice President_ - New York, NY (2012/8 – 2013/3)
* Wrote a Scala-based real-time bond-pricing ML model that combined prices from a Kalman filter with other factors to produce alternative prices aimed at market makers.
* Evaluated accuracy metrics for the prices generated by that filter.
(##) GOLDMAN SACHS
_Vice President – Strategist_, New York, NY (2011/6 – 2012/3)
* Optimized and fine-tuned a Java-based quoting engine for CME-listed FX options: price interpolation routines, mass quote interface, garbage collection.
* Developed an FX options market data enrichment application: real-time computation of implied volatility and Greeks, with results shipped to a high-throughput SQL-based data store (Sybase IQ).
(##) MERRILL LYNCH
*Vice President – Quantitative Developer*, New York, NY (2011/1 – 2011/6)
*Associate – Quantitative Analyst*, New York, NY (2008/10 – 2011/1)
* Worked in the automated market-making group for listed equity options.
* Developed, among other things, an algorithm for smart deflection of prices from trades for optimal inventory management.
* Worked on implied volatility estimation via numerical methods and volatility smile fitting.
(#) 👨‍🎓 Education
(##) 🤖 Masters in Artificial Intelligence (2024/1 – Present)
In progress... (45% done by 2024/6)
* Formalizing knowledge acquired empirically and informally during the last 12 years.
* Keeping up to date with newer ML and DL trends, including computer vision and NLP.
* Networking.
(##) 🧮 PhD in Applied Mathematics
_Cornell University_, Ithaca, NY (2003 – 2008)
* GPA: 4.1 / 4.0
* _Thesis_: Computational Methods for Static Allocation and Real-Time Redeployment of Ambulances. Used value function approximation methods and Monte Carlo simulation to produce good policies for relocating ambulances in real-time.
* ML / Stats and Reinforcement Learning related coursework:
  - Probability and stochastic processes
  - Mathematical programming (linear optimization)
  - Simulation (with variance reduction) · Time series analysis
  - Continuous optimization · Dynamic programming
  - Mathematical statistics · Matrix computations
(##) ⚛ BSc in Physics
_Universidad Nacional de Colombia_, Bogotá, Colombia (1998 – 2002)
* Financed 100% through merit scholarship (Colombian Oil Company)
* GPA: 4.89 / 5.00
(##) 🛠 SKILLS AND CERTIFICATIONS
(###) Programming languages & Libraries
* **Core Languages:** Python, Rust, Scala, C/C++
* **Strong knowledge:** SQL (Snowflake / Impala / Postgres), STL, MATLAB, Spark
* **Medium Knowledge:** R, Perl, Q/Kdb, Boost, LaTeX
(###) ML, Mathematics and Operations Research
* Machine Learning, regression, classification, collaborative filtering.
* Big data manipulation and analysis in Hadoop with Impala / Spark.
* Simulation-optimization, approximate dynamic programming (a.k.a. reinforcement learning), Monte Carlo simulation with variance reduction, probability.
* Scientific computing and numerical methods for matrix computations.
* Continuous and discrete optimization, network flow algorithms.
* Actuarial science (5 of 5 SOA exams passed)
(###) Selected MOOCs
* Machine Learning Specialization (5 Courses - Coursera - DeepLearning.AI - 2018)
* Artificial Intelligence Nanodegree (Udacity - 2019)
* Data Manipulation at Scale (U. of Washington - Coursera - 2017)
* Statistical Learning (Stanford Online - 2014)
(###) Natural Languages
**English:** fully bilingual. **Spanish:** native proficiency. **German:** good reading and writing, rusty speaking.