Erich Henrique

Comprehensive Time Series Exploratory Analysis

Photo by Jason Blackeye on Unsplash

Here you are with a dataset indexed by time stamps. Your data might be about storage demand and supply, and you are tasked with predicting the ideal replenishment intervals for a strategic product. Or maybe you need to translate historical sales information into key actionable insights for your team. Perhaps your data is financial, with information about historical interest rates and a selection of stock prices.

Metric Learning for Landmark Image Recognition

Colosseum by Hank Paul on Unsplash

Metric learning for instance recognition and information retrieval is a technique that has been widely implemented across multiple fields. The concept is highly relevant to novel research applications, such as the latest AI breakthrough in biology [2] with AlphaFold [11] by DeepMind. It is also mature and proven enough to see wide deployment in industry, from contextual information retrieval in Google Search [12] to image similarity for face recognition [7], which you might use every day to unlock your phone.

Efficient Labeling Through Representative Samples

On the left: Wildfire, photo by Mike Newbry. In the center: Tropical Storm, photo by Jeffrey Grospe. On the right: Pandemic Dashboard, photo by Martin Sanchez. Original images on Unsplash.

Cluster analysis is an unsupervised learning technique widely implemented throughout many fields of data science. When applied to data suited for hierarchical or partitional clustering, it can provide valuable insights into latent groups in the dataset and further improve your understanding of the key features that describe and classify individuals into meaningful clusters for your use case.

Table Data from Images

On the left: Canada Rent Rankings, May 2022. Report summary by Rentals.ca. On the right: Preprocessed image with cluster-defined table layout.

A crucial step in document parsing and recognition tasks, extracting table data from image and PDF files is a widely explored problem with its own challenges. While working on a small personal project, I dove deep into it and discovered a wide range of solutions of varying complexity.