My academic research interests include
Thiago Correia Pereira (2023). Innovative data collection framework for humanitarian logistics: A serious game solution (co-supervised with Marie-Ève Rancourt)
Maria Clara Martins (2023). Data-driven methods for inventory management in dock-based bike-sharing systems (co-supervised with Sanjay D. Jena)
Quentin Fournier (2022). Machine learning for anomaly detection in kernel traces.
Irving Muller Rodrigues (2022). Algorithms and learning models for bug report deduplication (co-supervised with Michel Dagenais)
Iman Kohyarnejad (2022). System performance anomaly detection using tracing data analysis (co-supervised with Michel Dagenais)
Rodrigo Alves Randel (2021). Optimization methods to enhance constraint-based semi-supervised clustering. (co-supervised with Alain Hertz).
Leandro Rochink Costa (2021). Workload optimization for swarm-powered ad-hoc clouds. (co-supervised with Andrea Lodi).
Diego Rocha Lima (2021). Atratividade visual em roteamento de veículos através de otimização bi-objetivo. (co-supervised with Claudio C. Contardo)
Daniel Nobre Pinheiro (2020). Modelo fuzzy e convexo para agrupamento de dados por k-medoides.
Bruno Jefferson de Sousa Pessoa (2017). Problema das sequências justas ponderadas.
Nielsen Castelo Damasceno (2016). Novas estratégias para resolver o problema da degeneração no algoritmo k-means.
Éverton Santi (2014). Problema das p-medianas heterogêneo: modelos e algoritmos (co-supervised with Simon J. Blanchard - Georgetown University)
Robin Moine (2023). Comparaison empirique de deux stratégies de recherche locale.
Adem Aouni (2022). Utilisation d’un espace latent pour une déduplication de bogues rapide.
Emeric Courtade (2022). Importance des variables et régressions statistiques imitant l’heuristique de branchement fort dans un problème de rotation d’équipages
Pierre Pereira (2022). Learning to branch for the crew pairing problem (co-supervised with François Soumis)
Kim Thuyen Ton (2021). Using a diversity criterion to select training sets for machine learning models. (co-supervised with Claudio C. Contardo)
Isabelle Bouchard (2021). Building damage assessment after a natural disaster in emergency contexts: A deep learning approach. (co-supervised with Marie-Ève Rancourt)
Mohammed Najib Haouas (2020). Résolution exacte du problème de partitionnement de données avec minimisation de variance sous contraintes de cardinalité par programmation par contraintes (co-dirigé avec Gilles Pesant)
Théo Moins (2020). Modèle hybride combinant réseau de neurones convolutifs et modèle basé sur le choix pour la recommandation de sièges
Nicolas Heutte (2020). A divide-and-conquer approach to employee scheduling. (co-supervised with Guy Desaulniers)
Laurent Boucaud (2019). Mécanismes dáttention pour les modèles convolutifs dans le cadre de la prédiction de trajectoire. (co-supervised with Nicolas Saunier)
Pierre Hulot (2018). Towards station-level demand prediction for effective rebalancing in bike-sharing systems (co-supervised with Sanjay D. Jena)
Allyson Fernandes Silva (2017). Um algoritmo evolucionário para o problema dinâmico de localização de facilidades com capacidades modulares. (co-supervised with Caroline Rocha)
Thiago Correia Pereira (2017). Novas heurísticas para o agrupamento de dados pela soma mínima de distâncias quadráticas.
Daniel Nobre Pinheiro (2016). O problema de clustering heterogêneo fuzzy: Modelos e heurísticas.
Rodrigo Alves Randel (2016). Utilização do problema das p-medianas como critério para o agrupamento de dados semi-supervisionado.
Leandro Rochink Costa (2016). Abordagem heurística baseada em Busca em Vizinhança Variável para o Agrupamento Balanceado de Dados pelo Critério da Soma Mínima das Distâncias Quadráticas.
Adriana Cavalcante Marques (2016). Revisão de tarifa do Sistema Elétrico Brasileiro com a aplicação do modelo de redes da Análise Envoltória de Dados - NDEA.
Fernanda Barreto Rocha (2015). Modelos Dinâmicos de Análise de Envoltória de Dados: Revisão da Literatura e Comparação de Modelagens.
Kayo Gonçalves e Silva (2013). Escalabilidade de uma Implementação Paralela do Simulated Annealing Acoplado.
Herica Debora Praxedes de Freitas (2013). Desenvolvimento de um modelo de simulação de ocupação de leitos para um hospital privado de Natal-RN.
Diogo Robson Monte Fernandes (2012). Algoritmos de otimização para decisões de localização de facilidades e distribuição em sistemas multinível de transporte rodoviário de carga.
Saulo de Tarso Alves Dantas (2012). Desenvolvimento de um modelo para resolver um problema real de roteirização do tipo dial-a-ride.
Isaac Franco Fernandes (2010). Algoritmos para o problema de localização de uma facilidade com distâncias limitadas e restrições de atendimento.
This project focus on predicting the hourly demand for demand rentals and returns at each station of the BIXI's bike sharing system. Our proposed model uses temporal and weather features to predict the main characteristics of the traffic demand (e.g. mean and variance of trips from/to each station). The model first extracts the main traffic behaviors from the bike stations. These simplified behaviors are then predicted and used to perform station-level predictions based on machine learning and statistical inference techniques. Finally, the model determines inventory intervals, which are often used by bike sharing companies for their online rebalancing operations.
The communication and computing infrastructure has evolved through the years, getting more efficient, sophisticated, integrated and networked. Newer mobile devices (including smart robots or autonomous cars) and servers often contain 8 or more cores in their central processing unit. These systems are based on heterogeneous processors, with efficient traditional central processing units, but also with co-processing units optimised for graphics (GPGPUs with thousands of cores), networking, signal processing or even for Machine Learning. These co-processing units are highly parallel and often contain over 8 billion logic elements (transistors) each. Adding to this complexity is the increasing reliance on virtualisation, which hides the specificities of the hardware, allowing an application to run on several different processor models, but makes the performance more difficult to analyse. As a result, even a simple operation such as initiating a phone call, making a Web search, routing a packet or displaying a video frame, can involve many parallel cores on more than one processing unit, possibly on several servers. Moreover, the same operation, a few seconds later, may be served in a different way by different cores and physical servers. Therefore, understanding the performance of these operations has become extremely difficult and the tools for that purpose are severely lacking. In this project, the tracing, monitoring, profiling and debugging tools for manycore systems will be extended to efficiently extract information from all units in all layers, from the hardware to the application, and cope with the large number (several thousands) of cores. Furthermore, new methods and algorithms will be developed to automate the analysis of the extracted monitoring data. As a result, the designers and operators of distributed applications on mobile devices, cloud servers and other heterogeneous computing systems, will have the tools in hand to quickly analyse their system performance, automatically or manually find problems, and optimise operations.
Our current speed of data generation combined with storage capacity increases have given rise to new paradigms in computing. For example, according to the IBM's website, there are approximately 695,000 status updates and 11 million instant messages sent every minute on Facebook. However, many organizations have faced the problem of having a lot of data, but poor knowledge about them. Clustering methods help to automatically identify unobserved groups for a set of data objects, and are currently being radically transformed by the size, the variety and the nature of the available data, i.e., by so-called Big Data. This research program focuses on the development of scalable algorithms for Big data clustering. This will be achieved both by: (i) redesigning well-known successful serial algorithms for scalability, leveraging their main theoretical ideas; and (ii) developing new algorithms and heuristics using new programming paradigms associated with Big Data. In summary, the objectives of this research program are : A. Produce an extensive survey of the existing Big Data clustering methods in order to provide a complete panorama about which ones can be adapted to approach Big data. B. Redesign successful serial clustering algorithms, leveraging their main theoretical ideas to work with the new programming paradigms and computational tools from Big Data. C. Develop algorithms for semi-supervised Big Data clustering, incorporating supplementary information provided by the user into the clustering decision process. D. Develop Big Data clustering algorithms for new emerging applications. E. Provide a repository of Big Data software and clustering algorithms with guaranteed effectiveness. The software and algorithms developed in this research program are expected to constitute new benchmarks to the field, allowing larger datasets to be tackled with efficiency and effectiveness, leading to new insights in commerce, industry and academia.Moreover, given the Big Data expert shortage in Canada and worldwide, this research program will help to form and train specialists who will be able to master the scientific and technological issues emerging from the Big Data explosion. In the long term, the findings of this research program will support data mining-driven decision making in the Internet of Things (IoT) era in which huge amounts of data are generated in real-time from the most varied sources and devices (e.g. vehicles, sensors, home appliances, etc.). By combining massive data processing, machine learning techniques and algorithms, IoT devices will be able to perform clustering as well as other classification tasks on a large scale.