IMScience Tech Talks is a space for exchanging ideas and experiences on subjects related to image and multimedia data analysis. This kind of initiative is made possible thanks to several cooperation projects, such as CAPES/COFECUB, STIC-AMSUD, and Équipe Associée, among others. Thanks, my friends.
| Speaker | University or Research center | Talk | Date |
| --- | --- | --- | --- |
| Laurent Najman | ESIEE Paris | A Tutorial on Applications of Power Watershed Optimization to Image Processing | April 09, 2021 |
| Paulo Miranda | Universidade de São Paulo (USP) | Efficient Hierarchical Graph Partitioning for Image Segmentation by Optimum Oriented Cuts | May 07, 2021 |
| Davide Bacciu | University of Pisa | A Gentle Introduction to Deep Learning for Graphs | May 21, 2021 |
| Benjamin Perret | ESIEE Paris | Higra: Hierarchical Graph Analysis | June 04, 2021 |
| Philippe Salembier | Universitat Politècnica de Catalunya - BarcelonaTech (UPC) | Processing Radar Images with Hierarchical Region-Based Representations and Graph Signal Processing Tools | June 18, 2021 |
| Jesus Perez-Martin | Universidad de Chile | Video-Text Translation for Bridging Vision and Language | August 13, 2021 |
| Thiago Spina | Laboratório Nacional de Luz Síncrotron | High Throughput Visualization and Segmentation of Large-scale Volumes in X-ray Microscopy Imaging | August 27, 2021 |
| Juan Manuel Barrios | ORAND and Impresee | Impresee: Innovation based on Artificial Intelligence and Computer Vision | September 24, 2021 |
| Alexandru Telea | Utrecht University | Visualizing the Black Box of Machine Learning: Challenges and Opportunities | October 08, 2021 |
This survey reviews the existing applications of the power watershed (PW) optimization framework in the context of image processing. It is known in the literature that the PW framework, when applied to well-known graph-based image segmentation and filtering algorithms such as the random walker, isoperimetric partitioning, ratio-cut clustering, multi-cut, and shortest-path filters, yields faster yet consistent solutions. This talk explains the intuition behind the PW framework, namely the exploitation of contrast invariance in image data, and illustrates it with synthetic images and experiments. The ideas from the framework can be adapted to a large class of graph-based cost-minimization methods and can be combined with deep learning networks, leading to scalable algorithms that attain high-quality image segmentation and filtering results.
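The contrast-invariance intuition has a well-known limiting case that can be sketched in a few lines (a didactic illustration, not code from the talk): as the exponent on the edge weights grows, seeded power-watershed segmentation reduces to a maximum spanning forest computation, which a union-find pass over the edges sorted by decreasing similarity can carry out.

```python
def msf_segment(n, edges, seeds):
    """Seeded maximum-spanning-forest segmentation (the PW limit case).

    n     -- number of vertices (pixels)
    edges -- list of (u, v, similarity) tuples
    seeds -- dict {vertex: label}, labels are positive ints (0 = unlabeled)
    Returns the label assigned to each vertex.
    """
    parent = list(range(n))
    label = [seeds.get(v, 0) for v in range(n)]

    def find(x):                      # union-find with path halving
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # merge components from the strongest similarity downwards
    for u, v, w in sorted(edges, key=lambda e: -e[2]):
        ru, rv = find(u), find(v)
        if ru == rv:
            continue
        if label[ru] and label[rv] and label[ru] != label[rv]:
            continue                  # both sides reached by different seeds
        parent[ru] = rv
        label[rv] = label[rv] or label[ru]
    return [label[find(v)] for v in range(n)]

# four pixels on a line, strong similarity within objects, weak across them
labels = msf_segment(4, [(0, 1, 0.9), (1, 2, 0.1), (2, 3, 0.8)], {0: 1, 3: 2})
```

The weakest edge (similarity 0.1) is never merged across the two seeds, so the line splits exactly there.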
In this work, a hierarchical graph partitioning based on optimum cuts in graphs is proposed for unsupervised image segmentation that can be tailored to the target group of objects, according to their boundary polarity, by extending the Oriented Image Foresting Transform (OIFT). The proposed method, named UOIFT, theoretically encompasses as a particular case the single-linkage algorithm by minimum spanning tree (MST) and gives superior segmentation results compared to other approaches commonly used in the literature, usually requiring a smaller number of image partitions to accurately isolate the desired regions of interest with known polarity. The method is supported by new theoretical results involving the usage of non-monotonic-incremental cost functions in directed graphs and exploits the local contrast of image regions, being robust to illumination variations and inhomogeneity effects. UOIFT is demonstrated using a region adjacency graph of superpixels in medical and natural images.
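The Image Foresting Transform underlying OIFT can be illustrated, in its simplest undirected seeded form with the max-arc path cost, as a Dijkstra-like label propagation (a simplified sketch of the IFT primitive, not the UOIFT algorithm itself):

```python
import heapq

def ift_maxarc(n, adj, seeds):
    """Seeded IFT with the f_max path cost.

    n     -- number of vertices
    adj   -- adj[u] = list of (v, arc_weight) pairs (undirected here)
    seeds -- dict {vertex: label}
    Each vertex gets the label of the seed it reaches through the path
    minimizing the maximum arc weight along the path.
    """
    INF = float("inf")
    cost = [INF] * n
    label = [0] * n
    heap = []
    for s, lab in seeds.items():
        cost[s] = 0
        label[s] = lab
        heapq.heappush(heap, (0, s))
    while heap:
        c, u = heapq.heappop(heap)
        if c > cost[u]:
            continue                      # stale queue entry
        for v, w in adj[u]:
            new_cost = max(c, w)          # f_max: bottleneck path cost
            if new_cost < cost[v]:
                cost[v] = new_cost
                label[v] = label[u]
                heapq.heappush(heap, (new_cost, v))
    return label

# a line graph 0-1-2-3 with a high-cost arc between vertices 1 and 2
adj = {0: [(1, 2)], 1: [(0, 2), (2, 9)], 2: [(1, 9), (3, 1)], 3: [(2, 1)]}
labels = ift_maxarc(4, adj, {0: 1, 3: 2})
```

The high-cost arc (weight 9) acts as the optimum cut: vertices 0 and 1 are conquered by the first seed, vertices 2 and 3 by the second.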
The adaptive processing of graph data is a long-standing research topic which has lately been consolidated as a theme of major interest in the deep learning community. The sharp increase in the amount and breadth of related research has come at the price of little systematization of knowledge and little attention to earlier literature. This work is designed as a tutorial introduction to the field of deep learning for graphs. It favours a consistent and progressive introduction of the main concepts and architectural aspects over an exposition of the most recent literature, for which the reader is referred to available surveys. The paper takes a top-down view of the problem, introducing a generalized formulation of graph representation learning based on a local and iterative approach to structured information processing. It introduces the basic building blocks that can be combined to design novel and effective neural models for graphs. The methodological exposition is complemented by a discussion of interesting research challenges and applications in the field.
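The "local and iterative" view of graph representation learning boils down to repeated neighborhood aggregation. A minimal NumPy sketch of one such layer (a generic graph-convolution-style update, not any specific architecture from the tutorial):

```python
import numpy as np

def gnn_layer(A, H, W):
    """One round of neighborhood aggregation.

    A -- (n, n) adjacency matrix
    H -- (n, d) node feature matrix
    W -- (d, k) learned weight matrix
    Each node averages its neighbors' features (plus its own, via a
    self-loop), applies the linear map W, then a ReLU non-linearity.
    """
    A_hat = A + np.eye(A.shape[0])            # add self-loops
    deg = A_hat.sum(axis=1, keepdims=True)    # node degrees
    return np.maximum(0.0, (A_hat / deg) @ H @ W)

# toy graph: a triangle (nodes 0-2) plus a pendant node (3)
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
H = np.ones((4, 3))   # constant initial node features (for illustration)
W = np.ones((3, 2))
H1 = gnn_layer(A, H, W)   # (4, 2) node embeddings after one round
```

Stacking several such layers lets information from increasingly distant nodes flow into each embedding, which is the core iterative mechanism the tutorial builds on.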
Higra — Hierarchical Graph Analysis is a C++/Python library for efficient sparse graph analysis with a special focus on hierarchical methods capable of handling large amounts of data. The main aspects of hierarchical graph analysis addressed in Higra are the construction of hierarchical representations (agglomerative clustering, mathematical morphology hierarchies, etc.), the analysis and processing of such representations (filtering, clustering, characterization, etc.), and their assessment. Higra targets a large audience, from students and practitioners wanting an accessible library for quick experimentation, to researchers developing new methods for hierarchical analysis of graph data. Higra is a generic toolbox for graph analysis and can be utilized in a large variety of application fields such as machine learning, data science, pattern analysis, and computer vision. Moreover, it contains an image analysis module easing the handling of pixel-grid graphs by providing efficient algorithms dedicated to this field.
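To make the notion of a hierarchical representation concrete, here is a plain-Python single-linkage sketch of the kind of merge tree such a library builds from an edge-weighted graph (this is a generic illustration of the concept, not Higra's API; see the Higra documentation for the actual functions):

```python
def single_linkage_tree(n, edges):
    """Build a single-linkage merge tree from an edge-weighted graph.

    n     -- number of leaf vertices
    edges -- list of (u, v, weight) tuples (weight = dissimilarity)
    Returns (parents, altitudes): parents[i] is node i's parent in the
    merge tree (leaves are 0..n-1, internal nodes are appended), and
    altitudes[i] is the merge weight of internal node i (0 for leaves).
    """
    parents = list(range(n))
    altitudes = [0.0] * n
    uf = list(range(n))            # union-find over the leaf vertices

    def find(x):
        while uf[x] != x:
            uf[x] = uf[uf[x]]
            x = uf[x]
        return x

    node_of = list(range(n))       # tree node representing each component
    for u, v, w in sorted(edges, key=lambda e: e[2]):
        ru, rv = find(u), find(v)
        if ru == rv:
            continue
        new = len(parents)         # create an internal node for the merge
        parents.append(new)
        altitudes.append(w)
        parents[node_of[ru]] = new
        parents[node_of[rv]] = new
        uf[ru] = rv
        node_of[rv] = new
    return parents, altitudes

# triangle graph: vertices 0 and 1 merge first, then vertex 2 joins
parents, altitudes = single_linkage_tree(3, [(0, 1, 0.2), (1, 2, 0.5), (0, 2, 0.9)])
```

Cutting such a tree at a given altitude yields a flat partition; filtering, clustering, and assessment all operate on the same structure.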
This talk will discuss the use of hierarchical region-based representations of images, such as maxtrees, mintrees, and Binary Partition Trees, for radar images. These representations can be considered an initial abstraction from the signal in which raw pixels are grouped to form regions that are hierarchically structured by inclusion in a tree. They provide multiple resolutions of description and easy access to subsets of regions. This approach and the associated notions will be discussed both for the maxtree description of Synthetic Aperture Radar (SAR) images and for the Binary Partition Tree for polarimetric SAR images. Once constructed, these hierarchical representations can be used for many applications, including filtering, segmentation, classification, and object detection. Many processing strategies consist in populating the tree with features of interest for the application and applying a specific graph cut called pruning. These pruning ideas will be illustrated in particular for polarimetric SAR image segmentation and speckle reduction. The tree representation itself is a specific graph structure. As a result, an alternative processing strategy consists in populating the tree with attributes but treating the resulting data as graph attribute signals, which can be processed with graph filters. The goal of this filtering step is to exploit the correlation existing between attribute values on neighboring tree nodes. Considering that trees are specific graphs where the connectivity towards ancestors and descendants may have different meanings, several filtering strategies can be defined. Besides classical graph filters, two new filtering notions can be used: tree and branch filters. These ideas will be illustrated in the context of ship detection in SAR images.
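The idea of filtering an attribute signal on a tree, i.e. exploiting correlation between neighboring tree nodes, can be sketched with a minimal averaging filter over a parent-array tree representation (a toy illustration of the concept, not the tree/branch filters from the talk):

```python
import numpy as np

def tree_average_filter(parents, attr):
    """Smooth an attribute signal living on tree nodes.

    parents -- parents[i] is node i's parent (roots point to themselves)
    attr    -- one attribute value per node
    Each node's value is averaged with its parent's and its children's
    values: one pass of a simple graph filter on the tree structure.
    """
    n = len(parents)
    acc = np.array(attr, dtype=float)
    cnt = np.ones(n)
    for i, p in enumerate(parents):
        if p != i:                       # skip self-loops at the roots
            acc[i] += attr[p]            # pull the parent's value in
            cnt[i] += 1
            acc[p] += attr[i]            # pull the child's value in
            cnt[p] += 1
    return acc / cnt

# two leaves (nodes 0, 1) under a root (node 2) carrying attribute 3
smoothed = tree_average_filter([2, 2, 2], [0.0, 0.0, 3.0])
```

An attribute spike isolated on one node is spread over its tree neighborhood, which is exactly the correlation-exploiting effect a graph filter on the tree provides.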
Automatically describing a video with natural language has recently gained attention as an essential high-level task in video understanding. Video captioning/description is the task of predicting a semantically and syntactically correct sequence of words given some context video. The most successful video captioning methods have several limitations, such as gaps in the semantic representations of the generated captions. The models depend strongly on the effectiveness of the semantic representations learned from visual models, yet they often produce syntactically incorrect sentences, which harms their performance on standard datasets. This Ph.D. research addresses these limitations by proposing new ways of dealing with the video captioning task, posing as a hypothesis that cues about the semantics and syntax of a video's descriptions can be extracted directly from the video. In the early stages of this Ph.D. work, we proposed a video captioning model that can selectively decide when to use visual or semantic information in the text-generation process. Subsequently, we increased the capacity of the model by incorporating other types of information. We proposed a Visual-Semantic-Syntactic Aligned Network, which guides the text-generation stage (decoder) by aligning temporal compositions of visual, semantic, and syntactic representations. We tested both architectures, obtaining state-of-the-art results on two widely used video captioning datasets: the MSVD and MSR-VTT datasets.
Recent developments in X-ray microscopy imaging are giving researchers from all fields of knowledge the ability to analyze materials with unprecedented resolution, both in space and time. At the forefront of such techniques, synchrotron light sources are pushing the limits of image acquisition, producing time-resolved and/or hyperspectral volumes with terabyte sizes in just a few minutes. Visualizing, processing, and segmenting those data are major challenges, requiring specialized High Performance Computing algorithms capable of aiding researchers in their analyses. In this talk, we will present an integrated framework being developed at the Sirius synchrotron light source in Brazil to address those challenges, in which beamline users are able to interact with the acquired volumes in near real-time, with little to no required expertise in HPC, Machine Learning, or Mathematical Morphology.
Impresee is a SaaS startup offering services for e-commerce based on Artificial Intelligence and Computer Vision. In this talk, I will review some technical details and commercial issues that have arisen during its six years of development.
Machine learning (ML) has witnessed tremendous successes in the last decade in classification, regression, and prediction tasks. However, many ML models are used, and sometimes even designed, as black boxes. When such models do not operate properly, their creators often do not know the best way to improve them. Moreover, even when models operate successfully, users often need to understand how and why they take certain decisions in order to gain trust in them. We present how visualization and visual analytics help explain (and improve) ML models. These cover tasks such as understanding high-dimensional datasets; understanding unit specialization during the training of deep learning models; exploring how training samples determine the shape of classification decision boundaries; and helping users annotate samples in semi-supervised active learning scenarios.
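One of the listed tasks, exploring how training samples shape classification decision boundaries, is often supported by rasterizing a classifier's predictions over a dense 2D grid. A minimal sketch using a hand-written nearest-centroid classifier (a toy stand-in for a real model, chosen only so the example is self-contained):

```python
import numpy as np

def decision_map(points, labels, grid_size=50):
    """Rasterize a nearest-centroid classifier over the data's bounding box.

    points -- (n, 2) array of 2D training samples
    labels -- class label of each sample
    Returns a (grid_size, grid_size) array: each cell holds the label of
    the class whose centroid is closest, making the boundary visible.
    """
    labels = np.asarray(labels)
    classes = sorted(set(labels.tolist()))
    centroids = np.array([points[labels == c].mean(axis=0) for c in classes])
    xs = np.linspace(points[:, 0].min(), points[:, 0].max(), grid_size)
    ys = np.linspace(points[:, 1].min(), points[:, 1].max(), grid_size)
    gx, gy = np.meshgrid(xs, ys)
    grid = np.stack([gx.ravel(), gy.ravel()], axis=1)
    # squared distance from every grid cell to every class centroid
    d = ((grid[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
    return np.array(classes)[d.argmin(axis=1)].reshape(grid_size, grid_size)

# two well-separated clusters; the map splits halfway between them
pts = np.array([[0.0, 0.0], [0.0, 1.0], [5.0, 0.0], [5.0, 1.0]])
m = decision_map(pts, [0, 0, 1, 1], grid_size=10)
```

Rendering such a map as an image (one color per label) shows at a glance where the boundary lies and which training samples sit near it.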
IMScience Tech Talks is a web seminar series, completely free of charge, organised by the IMScience research group of the Graduate Program in Informatics at the Pontifical Catholic University of Minas Gerais. If you have any questions regarding IMScience Tech Talks, please don't hesitate to contact us. Follow our Instagram.