09/2022 - Present. AI Scientist II at the National Irish Center for Applied AI. As a core member of the AI team for ICOS, an AI ecosystem for edge devices, I specialize in multi-modal learning, online learning, federated learning, and computer vision. My responsibilities include developing AI APIs, utilizing High Performance Computing (HPC), and applying DevOps tooling such as Docker and SLURM/Singularity. I actively support both national and EU innovation projects, focusing on data modeling and enhancing machine learning model deployment.
01/2022 - 01/2024. Fellow in Computer Science at Harvard University. I developed and assessed Computer Vision and Deep Learning algorithms for imaging-based single-cell methods projects, specifically CyCIF using high-dimensional microscopy images, within Harvard’s Visual Computing Group and the Harvard Medical School’s Lab of Systems Pharmacology. My work involved instance segmentation algorithms, unsupervised learning, image translation models, and unbiased pixel-level feature extraction for analyzing lateral spillover in segmented images of tissue. Supervised by Hanspeter Pfister and Siyu Huang at the Harvard John A. Paulson School of Engineering and Applied Sciences.
01/2022 - 08/2022. Machine Learning Intern at EcoVadis. I developed an innovative attribution strategy leveraging Contrastive Language-Image Pretraining (CLIP) to enhance logo identification and verification within EcoVadis’s production environment. This project combined YOLO and transformers for advanced text-image training and multi-modal logo identification. I also designed a self-supervised method for efficient logo annotation, curating over 20,000 images to refine the dataset, which significantly improved logo recognition in documents. Supervised by Sophia Katrenko in Paris, France.
05/2020 - 10/2020. Computer Vision and Artificial Intelligence Engineer at AI India Innovation Centre. I led the “Land Use Land Cover Classification using Advanced ML and DL Techniques for Larger Study Area” project. My primary tasks included accuracy assessment, building a satellite imagery dataset from scratch, model training and testing, and working with geospatial data and cloud platforms, including GPU and E2E cloud services and Google Colab. Based in Valencia, Spain.
2019. Astronomy and Physics Teacher. My role involved educating 400+ students about astronomy and physics at Camp Poyntelle, Pennsylvania, U.S.A.
2016 - 2018. IT Network Technician. Responsible for documenting all systems and network adjustments and ensuring the maintenance and installation of hardware and software for telecommunication and network devices across various faculties of the University of Cauca, Colombia.
2018. Computer Science Teacher. Developed web and mobile applications as part of the general literacy curriculum program for refugees at Social Hackers Academy in Athens, Greece.
2018. Full-Stack Software Developer. Designed and developed the official website of the Althaia Psychiatric Centre in Athens, Greece.
10/2013 - Present. General Director & CEO. I serve as an astronomy interpreter and public speaker, overseeing financial resources and logistics at the Astronomical Observatory Francisco José de Caldas, Colombia.
EDUCATION
2020 - 2022. MSc. Joint Master in Image Processing and Computer Vision [IPCV][News]. Awarded a full scholarship from the Erasmus Mundus Joint Master (EMJM). This program involved collaboration among the University of Bordeaux, the Autonomous University of Madrid, and Pázmány Péter Catholic University. Location: Spain - France - Hungary.
01/2022 - 08/2022. Machine Learning Intern at Harvard University. I contributed to the Hanspeter Pfister Lab by improving instance segmentation and deep clustering models for cancer cell discovery, utilizing the FASRC Computing Cluster (HPC). I had the privilege of collaborating with esteemed researchers such as Siyu Huang, Edward Novikov, and Hanspeter Pfister at the Harvard John A. Paulson School of Engineering and Applied Sciences.
2020. Data Scientist at Correlation One. DS4A/Colombia 4.0. Data Science for All (DS4A)/Colombia 2.0, a highly tailored program focused on Data Science and Artificial Intelligence, sponsored by the Ministry of Information and Communication Technologies of Colombia (MinTIC) with Correlation One.
2014 - 2019. B.A. in Electronics and Telecommunications Engineering from the University of Cauca. Ranking: Top 3%.
2023. Google GCP grant winner for quantum embedding ML image classification. [Code]
2022. Hack2hack first place biodiversity winner. [Link]
2021. OpenCV 2021 Competition Phase 1 Winner: Finalist in the world’s largest spatial AI competition, sponsored by Microsoft Azure and Intel, chosen from over 1,400 submissions, earning 6 OAK-D devices.
2021. Best President, Aerospace & Electronics Systems Society Colombia: Recognized as AESS Best President 2020, with the chapter also awarded as the best student chapter in Colombia. [Link]
2021. €5,000 Awarded Project for Dengue Forecasting: Towards a Smart Eco-epidemiological Model of Dengue in Colombia using Satellite, in Collaboration with MIT Critical Data Colombia. Supported by ESA Network of Resources Initiative. [Data][Slides][HuggingFace Maintainer]
2020. Embry Riddle MSc in Aerospace Engineering Scholarship: Graduate Teaching Assistantship in the Advanced Dynamics and Control Lab, with projects sponsored by NASA & SBIR/STTR Technologies.
2020. Hult Prize Regional Finalist, awarded in Popayán and Monterrey, Mexico.
2020. First Place Winner in IEEE R9 Humanitarian Activities: Ozone Purifier Project (US $5,200), awarded for creating a web and mobile application. Co-authored with Giovanna Ramirez, Germán Cambuya, Santiago Chicangana, Camilo Segura, and Jesús Gurrute.
2020. First Prize in IEEE SIGHTS COVID-19: Developed a mechanical ventilator (US $3,914 award). [Link]
2020. Data Science for All (DS4A) winner, joining the AI task force of the Colombian government. Organized by MinTIC and Correlation One.
2020. Best Impact Global Award for COVID-19 at AAPM Hackathon. [Link]
2019. Excellence Award in the Regional Huawei ICT Competition in Bogotá, Colombia.
2016 - 2017. Half Honorific Scholarship for B.S. at the University of Cauca.
Logo verification and characterization with machine learning for document analysis. [Online]. EcoVadis - Paris, France.
Mechanism for the characterization of motion artifacts in photoplethysmography signals under low-intensity movements for tachycardia and bradycardia events. Tachycardia and Bradycardia Detection using Wearable Photoplethysmography under low-intensity Motion Artifacts. [Online]. University of Cauca - Popayán, Colombia.
PUBLICATIONS
2024. S. A. Cajas, J. Samanta, A. L. Suárez-Cetrulo, and R. S. Carbajo, “Adaptive Machine Learning for Resource-Constrained Environments,” in Discovering Drift Phenomena in Evolving Landscape (DELTA 2024) Workshop at ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD 2024), Barcelona, Catalonia, Spain, Aug. 26, 2024. [Zenodo]. [Online]
2024. Moukheiber, D., Restrepo, D., Cajas, S. A., ... & Celi, L. A. (2024). A multimodal framework for extraction and fusion of satellite images and public health data. Scientific Data, 11(1), 634. [Online]
2024. Cajas, S. A., Restrepo, D., Moukheiber, D., Kuo, K. T., Wu, C., Garcia Chicangana, D. S., Paddo, A. R., Moukheiber, M., Moukheiber, L., Moukheiber, S., Purkayastha, S., Lopez, D. M., Kuo, P., & Celi, L. A. (2024). A Multi-Modal Satellite Imagery Dataset for Public Health Analysis in Colombia (version 1.0.0). PhysioNet. [Paper][code]
2024. Kuo, K.T., Moukheiber, D., Cajas, S. A., Restrepo, D., Paddo, A.R., Chen, T.Y., Moukheiber, L., Moukheiber, M., Moukheiber, S., Purkayastha, S. and Kuo, P.C., 2024. DengueNet: Dengue Prediction using Spatiotemporal Satellite Imagery for Resource-Limited Countries. arXiv preprint arXiv:2401.11114. [Online]
2023. A. L. Suárez-Cetrulo, Cajas, S. A., J. Samanta. “Intelligence Layer - ICOS project Architecture White Paper”, [Paper]
2023. Xie, K., Huang, S., Cajas, S. A., Pfister, H. and Wei, D., 2023. S3-TTA: Scale-Style Selection for Test-Time Augmentation in Biomedical Image Segmentation. arXiv preprint arXiv:2310.16783. [Paper]
2021. Cajas, S., Astaiza, P., Garcia-Chicangana, D.S., Segura, C. and López, D.M., 2020, September. ECG Arrhythmia Classification Using Non-Linear Features and Convolutional Neural Networks. In 2020 Computing in Cardiology (pp. 1-4). IEEE. [Paper][Code]
2020. Cajas, S.A., Landínez, M.A. and López, D.M., 2020, January. Modeling of motion artifacts on PPG signals for heart-monitoring using wearable devices. In 15th International Symposium on Medical Information Processing and Analysis (Vol. 11330, pp. 320-334). SPIE. [Paper][Code]
2020. “Colcart - Web and mobile application to market and produce organic products in the department of Cauca”. [Online]
2020. “Design and implementation of a satellite model for the Ibero-American Canned Satellite Competition”. [Online]
UPCOMING:
2024. S. A. Cajas, J. Samanta, A. L. Suárez-Cetrulo. A comprehensive review of Machine Learning at the Edge. [soon!]
2024. Restrepo, D., Wu, C., Cajas, S. A., Nakayama, L. F., Celi, L. A., & López, D. M. (2024). Multimodal deep learning for low-resource settings: A vector embedding alignment approach for healthcare applications. [Online]
2024. A. L. Suárez-Cetrulo, Cajas, S. A., J. Samanta, and R. S. Carbajo, “Machine Learning under Drifts and Shifts,” in Proceedings of the International Conference on Machine Learning, ISBN: 978-1-83508-358-1, 2023. [Code]
2023. G. J. Baker, E. Novikov, Y.-A. Chen, C. B. Hug, Cajas, S. A., S. Huang, C. Yapp, S. Coy, H. Pfister, A. Sokolov, and P. K. Sorger, “Contextual Cell State Inference in Whole-slide Multiplex Images of Tissue with Deep Convolutional Neural Networks,” IEEE Transactions on Medical Imaging, vol. 40, no. 7, pp. 1908-1919, Jul. 2021. [on-review]
SEMINARS & BLOGS
2024. Poster. Adaptive Machine Learning for Resource-Constrained Environments. Discovering Drift Phenomena in Evolving Landscape (DELTA 2024), Workshop at ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD 2024), Barcelona. CeADAR Ireland. [Online]
2023. “Unlocking the Potential of Machine Learning at the Edge: Challenges and Future Trends”. By CeADAR: Sebastián Andrés Cajas, Jaydeep Samanta, Andrés L. Suárez-Cetrulo, Ricardo Simón Carbajo. [Online]
2021. Satellite imagery for NASA Space Apps challenge 2021. Conference with AESS Unicauca. April 23-25. [Conference]
VOLUNTEERING AND OPEN-SOURCE PROJECTS
2024: Co-Lead, Quantum-Based Initiative for Classification and Optimization [Qubico].
2024: (Book) - Machine Learning for Drifts and Shifts.
2024: Vector Embeddings for Quantum Mechanics: Optimizing latent space using quantum computing [Code]
2020-present: NASA SpaceApps Leader at University of Cauca and AESS-Unicauca, Colombia. [Website]
2020-present: Open-source Research Data Scientist and MIT Critical Data Mentor. PI: Leo Celi. [Website]
2021-present: Advisory Board Member, Ex-President, Founder of AESS Unicauca, University of Cauca. [Website].
2020: Satellite Extractor: Dockerized API for downloading satellite imagery, developed with MIT Critical Data Colombia and sponsored by SentinelHub. [Code][Datasets][Tutorials]
2020: Computer Vision Researcher (Open-source): Smart indoor positioning system for the visually impaired.
2020: IEEE Human Sights: Mechanical ventilator project, sponsored by IEEE Humanitarian Sights. [Website][Code][Mobile App]
ML & AI: Proficient in ML frameworks (scikit-learn, TensorFlow, PyTorch); ML DevOps; optimized pipelines & real-time inference. Unsupervised, Self-supervised, 3D CV, GANs, LLMs.