Image taken from Pexels

A comparison between Keras’ ImageDataGenerator, TensorFlow’s image_dataset_from_directory and various tf.data.Dataset pipelines

When we start learning how to build deep neural networks with Keras, the first method we use to input data is simply loading it into NumPy arrays. At some point, especially when working with images, the data is too large to fit in memory, so we need an alternative to arrays. In my experience, the go-to solution to that problem is the tool built into Keras called ImageDataGenerator. This creates a Python generator that feeds the data gradually to the neural network without keeping it all in memory. …


A librarian’s job is pretty hard so I am trying to use AI to simplify it! Source: pixabay

Collect data, train a model and deploy it using popular data science libraries

Many websites nowadays use some kind of recommendation system to guide their customers towards interesting products and make them buy more. The most advanced ones (like those used by Amazon, Netflix or YouTube) can be very complicated, but it is surprisingly simple to build a decent recommendation engine from start to finish for small-scale applications. This is what I want to show in this article.


I am taking part in the NHL pool organized by the Canadian Blind Hockey Association.

Machine Learning

How I used machine learning to predict the number of points that a hockey player will score this year

With the new National Hockey League (NHL) season just around the corner, I received an unexpected invitation to participate in a pool with people from the blind hockey community. The idea is to pick a player in each of the 24 groups of six similar players to form your team. The winner is the person whose team gets the most points during the season. After trying to choose players for a couple of minutes, I started to wonder how hard it would be to train a machine learning model that helps me in this task. …


Based on images found on pixabay

Train your own computer vision models in a fraction of the time with Decathlon’s Python package

At Décathlon Canada, one of the main tasks of the AI team is to extract meaningful information from images. In order to do so, we have spent a fair amount of time building a solid pipeline to train image classification models with Python, using TensorFlow. My colleague Samuel already wrote a wonderful series of articles (see parts 1, 2 and 3) about the algorithms that are used in our pipeline. To summarize, we combine the power of data augmentation, transfer learning and hyperparameter optimization to get the best models possible.

To make this pipeline as easy to use and…


How Decathlon capitalizes on social listening

One of the strengths of the Artificial Intelligence team at Decathlon Canada is extracting information from images. In the past we have spent a good amount of time building algorithms to analyse sports images (see, for example, parts one, two and three of our image classification story). This has led us to release the Sport Vision API, which can be used to extract the following data from an image:

  • Whether or not a sport is displayed
  • Sport practiced
  • Location (indoor/outdoor)
  • Sports equipment displayed and its color attributes

Of course all this information is cool to have, but the question is…



How to save some time managing your images when working on image classification.

If you spend enough time working on computer vision using deep learning, you’ll know that some code snippets come back again and again when managing your images. If you are like me and you are tired of always writing Python code to go through a list of images separated into categories in a dataset, or to build a NumPy array from your images, this post will be useful to you. I will also explain how to build a module containing your most used functions that can be imported from any script on your machine.

Adding a module to PYTHONPATH

Before describing the functions…



How I became a data scientist with a PhD in theoretical physics

There are a lot of stories out there of people who explain how they became data scientists, but when I made the transition I found it hard to find one that corresponded to my profile. This is why I want to share my journey, just in case it could inspire someone else going through the same thing. I decided to pursue a career in data science a little more than a year ago (during the fall of 2018) and I am now a data scientist at Décathlon Canada, so I achieved my goal. I will describe my journey as an…



How to improve your deep learning dataset with realistic images.

A deep learning model is only as good as the data you feed it. This is why it is very important to spend enough time gathering a large amount of good data. What I mean by good varies according to the problem at hand. A useful trick to determine if your data is appropriate is to think about what the users’ inputs will look like. This is an important issue, especially when working with images. Lighting, contrast, orientation, image quality and perspective can vary a lot, and not taking this variation into account can create huge errors in prediction. For…



Machine Learning

Using the Flask library to deploy a language identifier Keras model into a web app and URL-based API.

Training a neural network to achieve a specific task is pretty fun and interesting, but the work doesn’t stop when you are happy with the model’s performance. It is useful to be able to share the model that you built with other people who may want to use it. This is important both for personal projects and for industry work. This is why I wanted to learn how to deploy the models that I built to identify the language that a word is written in. …



An example of character level word generation

A while ago, when I was finishing the Deep Learning specialization on Coursera, I decided to try to implement some of the things that I learned. As everyone who took these courses knows, the theory is incredible but the practice is pretty much just pre-made code with some blanks to fill in. I think it is not a very good way to learn, so I wanted to code it all from scratch. The example that attracted me the most was generating words that look like the data the model is given. The course used dinosaur names and I decided to use Pokémon…

Yan Gobeil

I am a data scientist at Décathlon Canada working on generating intelligence from sports images. I aim to learn as much as I can about AI and programming.
