(#) Mateo Restrepo

_ML Engineer / Quantitative Developer_   •  [mr324@cornell.edu](mailto:mr324@cornell.edu)  •   [LinkedIn Profile](https://linkedin.com/in/mateorestrepo)   •   [Github](https://github.com/cuckookernel)

(##) Summary

PhD-level Machine Learning Engineer with 8+ years of experience building predictive models. Proven track record in commercial banking, proptech, and high-frequency trading. Expert in combining advanced mathematics, data engineering, and programming to drive impactful business solutions.

(#) 👷 Professional experience

(##) [La Haus](https://www.lahaus.com) (2019/4 – 2024/1)
_Principal ML Engineer_
* Developed a RAG question-answering bot to automate first contact and profiling of new homebuyer leads.
* Using OpenAI's LLM APIs, developed the back-end for a WhatsApp chatbot that won an internal hackathon; the bot profiled new homebuyer leads and generated a personalized sales pitch for a selected residential development.
* Did 100% of the design and carried out 70% of the implementation, using _Feast_, of an internal feature store covering residential listings and homebuyers. The implementation included batch and online data sources, feature views, etc.
* Developed personalized property recommendation algorithms for home buyers, as well as property similarity algorithms based on embeddings produced by deep learning.
* Implemented DL algorithms for image quality scoring and classification, and for automated captioning.

(##) GRUPO BANCOLOMBIA S.A.
_Analytic Capabilities Manager_, Medellín, Colombia (2014/8 – 2018/4)
* Developed an item-based recommendation engine to suggest Point-of-Sale businesses for debit and debit/credit-card holders.
* Created an income-estimation model (via regression RFs) for individuals as one of the cornerstones for remaking the bank’s credit preapproval system.
* Participated in and contributed to all phases of the creation of the new credit scoring system: ETL, clean-up, ML model architecture definition, training, hyperparameter tuning, evaluation, automation and deployment to production. All using open source tools.
* Contributed to the creation of classification models to predict customer churn.
* Built an NLP-based automatic CV scorer (through NMF) for detecting good data analyst/scientist profiles.
* Provided instruction to coworkers at the bank in SQL, Python and Spark for data analysis and ML.

(##) BENCHMARK SOLUTIONS
_Assistant Vice President_, New York, NY (2012/8 – 2013/3)
* Wrote a Scala-based real-time bond pricing ML model that combines prices coming from a Kalman filter with other factors to produce an alternative version of prices aimed at market-makers.
* Also worked on evaluating accuracy metrics for the prices generated by said filter.

(##) GOLDMAN SACHS
_Vice President – Strategist_, New York, NY (2011/6 – 2012/3)
* Optimized and fine-tuned a Java-based quoting engine for CME-listed FX options: price interpolation routines, mass quote interface, garbage collection.
* Developed an FX options market data enrichment application: real-time implied vol and greeks computation, with results shipped to a high-throughput SQL-based data store (Sybase-IQ).

(##) MERRILL LYNCH
*Vice President – Quantitative Developer*, New York, NY (2011/1 – 2011/6)
*Associate – Quantitative Analyst*, New York, NY (2008/10 – 2011/1)
* Collaborated in the automated market making group for listed (equity) options.
* Developed, among other things, an algorithm for smart deflection of prices from trades for optimal inventory management.
* Worked on implied volatility estimation via numerical methods and volatility smile fitting.

(#) 👨‍🎓 Education

(##) 🤖 Master's in Artificial Intelligence (2024/1 – Present)
In progress... (45% done by 2024/6)
* Formalizing knowledge acquired empirically / informally during the last 12 years.
* Self-updating in newer ML and DL trends, including computer vision and NLP.
* Networking.

(##) 🧮 PhD in Applied Mathematics
_Cornell University_, Ithaca, NY (2003 – 2008)
* GPA: 4.1 / 4.0
* _Thesis_: Computational Methods for Static Allocation and Real-Time Redeployment of Ambulances. Used value function approximation methods and Monte Carlo simulation to produce good policies for relocating ambulances in real-time.
* ML / Stats and Reinforcement Learning related coursework:
  - Probability and Stochastic Processes
  - Mathematical programming (linear optimization)
  - Simulation (w. variance reduction)
  - Time Series analysis
  - Continuous optimization
  - Dynamic programming
  - Mathematical statistics
  - Matrix computations

(##) ⚛ BSc in Physics
_Universidad Nacional de Colombia_, Bogotá, Colombia (1998 – 2002)
* Financed 100% through merit scholarship (Colombian Oil Company)
* GPA: 4.89 / 5.00

(##) 🛠 SKILLS AND CERTIFICATIONS

(###) Programming languages & Libraries
* **Core Languages:** Python, Rust, Scala, C/C++
* **Strong knowledge:** SQL (Snowflake / Impala / Postgres), STL, MATLAB, Spark
* **Medium knowledge:** R, Perl, Q/Kdb, Boost, LaTeX

(###) ML, Mathematics and operations research:
* Machine Learning, regression, classification, collaborative filtering.
* Big data manipulation and analysis in Hadoop with Impala / Spark.
* Simulation-optimization, approximate dynamic programming (aka. reinforcement learning), Monte Carlo simulation with variance reduction, probability.
* Scientific computing and numerical methods for matrix computations.
* Continuous and discrete optimization, network flow algorithms.
* Actuarial science (5 of 5 SOA exams passed)

(###) Selected MOOCs
* Machine Learning Specialization (5 Courses - Coursera - DeepLearning.AI - 2018)
* Artificial Intelligence Nanodegree (Udacity - 2019)
* Data Manipulation at Scale (U. of Washington - Coursera - 2017)
* Statistical Learning (Stanford Online - 2014)

(###) Natural Languages
* **English:** fully bilingual.
* **Spanish:** native proficiency.
* **German:** good reading and writing, rusty speaking.