Introspection on a decade of coding
Autobiography of a nerd
A few days ago YouTube recommended me a video called 10 Years Of Coding - Everything I've Ever Done. "Wow", I thought, "10 years is such a long time, I've only been coding for like 3… 5… 8… 11 years?", realizing I had actually started programming more than a decade ago too. It was a long and varied period, so let's look back at all these years spent yelling at computers designing such insightful programs.
A whole world inside a calculator
This journey started in 2011 when a friend showed me a simple slot machine he had made on his calculator: you bet some points, three dice are thrown, and if two or three of them are identical you hit a jackpot and win more points; if all three are different you lose your bet. Simple, yet so effective. He gave me the whole code handwritten on a sheet of paper, which I spent a weekend copying into my own calculator, looking up function names in menus and trying to understand what each part meant. I then started modifying it, since it was easy to abuse: surprisingly, betting a negative amount is actually a good idea when you have a high chance of losing.
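The original was written in TI-Basic, but the whole logic fits in a few lines of Python. The payout multipliers below are made up for illustration; only the structure follows the game:

```python
import random

def slot_machine(points, bet):
    """One round of the slot machine: throw three dice, win on a pair
    or a triple, lose the bet otherwise (multipliers are illustrative)."""
    dice = [random.randint(1, 6) for _ in range(3)]
    distinct = len(set(dice))
    if distinct == 1:            # all three identical: jackpot
        return points + 10 * bet
    if distinct == 2:            # exactly two identical: small win
        return points + 2 * bet
    return points - bet          # all different: lose the bet

# The exploit: all three dice differ with probability 120/216 ≈ 0.56,
# so losing is the most likely outcome and a negative bet wins on average.
```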
That first language I had discovered was called TI-Basic, but I had no idea similar languages existed for computers: computers are so much more complex than calculators, and they have graphical interfaces, so surely programming them must be very different.
For the next few years I went on to develop other games while bored during classes or on holidays, from a tic-tac-toe game where moves were played with the numerical pad, to a hangman game (pendu) where you had to guess a word, or even a program that randomly sampled a joke from a list and displayed it. This introduced me to the main concepts of variables, conditions, and loops, making our first programming classes with pseudocode a lot simpler. Since programs could be shared between calculators, some of these games ended up being played by a few classmates and friends.
Wait, computers are not magical?
Since our first programming classes had shown me that coding on a computer was actually a lot simpler than expected, in 2014 I decided to learn the C language from a book, which was just a printed version of a French website with coding lessons, during the summer break before joining my classe prépa (French preparatory class). Just like on the calculator, I learned by making games: the same slot machine, then a stock trading simulator, etc.
For the next two years at the Jean-Baptiste Say classe prépa I spent most of my time studying for a competitive exam, which greatly reduced my free time for side projects. I had a first introduction to Python in class, although I struggled with it since it targeted "useful" cases such as mathematical applications instead of game development. Programming is too important to be left to the useful cases, as Clemenceau would probably not have said.
One tool I did get the chance to discover during this time was Twine, an editor to create multiple-choice stories using blocks linked to each other. Although each block can contain just raw text and links to other blocks to advance the story, Twine also supports a variant of JavaScript for global variable assignments and conditions, giving stories more options: if you want to free the princess in a fairy tale, you need the dragon_alive variable to be set to False. With some extrapolation and a few loops this can be used to store combat stats and item inventories for RPGs, which led to a few interesting stories and games, although I never actually completed them for lack of time.
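Twine's variable-and-condition mechanic boils down to very little code. Here is a minimal sketch of the same idea in Python, not Twine's actual format: passages are text plus links, and a global state gates some choices (all names here are hypothetical):

```python
# Global story state, like Twine's global variables.
state = {"dragon_alive": True}

# passage name -> (text, [(choice label, target passage, optional condition)])
passages = {
    "gate": ("You reach the dragon's lair.",
             [("Fight the dragon", "fight", None),
              ("Free the princess", "princess",
               lambda s: not s["dragon_alive"])]),
    "fight": ("The dragon falls.", []),
    "princess": ("The princess is free!", []),
}

def available_choices(passage):
    """Return the choices whose condition (if any) holds in the current state."""
    _, links = passages[passage]
    return [(label, target) for label, target, cond in links
            if cond is None or cond(state)]
```

Combat stats and inventories are just more keys in `state`, which is all the "extrapolation" an RPG needs.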
Somewhere in these two years I also discovered genetic algorithms and bio-inspired programs, which stayed as a side topic of interest for the next few years.
2016-2018: Supaero
In 2016 I ranked top 30 nationally at the exam and joined Supaero in the general engineering program. Alongside the main computer science course, which allowed me to learn C and Java in a more academic setting, I was able to choose a set of modules to discover new paradigms and applications:
- Web application development with HTML, CSS, some JavaScript and SQL databases
- VHDL programming for FPGA cards, where all instructions in the program are synchronous by default
- Functional programming with Scala
- Autonomous systems and robotics
Student clubs also gave me the opportunity to develop new sets of skills, from the Robotics club to, more surprisingly, the Junior Company, where I developed VBA tools for Excel to keep track of KPIs and for management purposes.
My interest in genetic algorithms shone twice. First, in our first-year project I tried applying them to optimize decision-making in the robotics club's autonomous robot, including a first attempt at optimizing state machines for something that resembled very limited reinforcement learning.
Then, the second-year Java project was an artificial life simulation with prey and predators which proved quite unstable: one population or the other would disappear after a while. So I optimized the simulation's hyperparameters, such as reproduction and hunting ranges, with a genetic algorithm maximizing the lifetime of the system, yielding a stable simulation. Predators ended up slow and hunting from afar while prey were quick, leading to a weird paragraph in my report explaining that it actually made sense if they were chameleons and fireflies.
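The hyperparameter search itself needs nothing fancy. A minimal sketch of that kind of genetic algorithm, truncation selection plus Gaussian mutation, assuming a toy fitness in place of the simulation's lifetime:

```python
import random

def evolve(fitness, bounds, pop_size=20, generations=50, sigma=0.1):
    """Keep the best half of the population each generation and refill
    it with mutated copies; `bounds` is one (low, high) pair per gene."""
    pop = [[random.uniform(lo, hi) for lo, hi in bounds]
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]           # elitism: parents survive
        children = [[min(hi, max(lo, g + random.gauss(0, sigma * (hi - lo))))
                     for g, (lo, hi) in zip(random.choice(parents), bounds)]
                    for _ in range(pop_size - len(parents))]
        pop = parents + children
    return max(pop, key=fitness)

# Toy stand-in for "simulation lifetime": a peak at (3, -1).
best = evolve(lambda p: -(p[0] - 3) ** 2 - (p[1] + 1) ** 2,
              [(0, 10), (-5, 5)])
```

In the real project the fitness call ran the whole prey/predator simulation, which is also why such a cheap black-box method was attractive: it only needs a score, not gradients.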
Introduction to AI
One of the modules I chose at Supaero was called "Introduction to AI through game programming", which covered good old-fashioned AI methods for adversarial games, such as the minimax algorithm and its alpha-beta pruning variant, hash tables, and Monte-Carlo Tree Search.
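For reference, the alpha-beta variant of minimax fits in a short function. This sketch works over generic game callbacks (`moves`, `apply`, `evaluate` are placeholders for any two-player game, not the class's actual code):

```python
def alphabeta(state, depth, alpha, beta, maximizing, moves, apply, evaluate):
    """Minimax with alpha-beta pruning over a generic two-player game."""
    ms = moves(state)
    if depth == 0 or not ms:
        return evaluate(state)
    if maximizing:
        value = float("-inf")
        for m in ms:
            value = max(value, alphabeta(apply(state, m), depth - 1,
                                         alpha, beta, False,
                                         moves, apply, evaluate))
            alpha = max(alpha, value)
            if alpha >= beta:      # beta cut-off: opponent avoids this branch
                break
        return value
    value = float("inf")
    for m in ms:
        value = min(value, alphabeta(apply(state, m), depth - 1,
                                     alpha, beta, True,
                                     moves, apply, evaluate))
        beta = min(beta, value)
        if alpha >= beta:          # alpha cut-off: we avoid this branch
            break
    return value
```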
Although we could choose any two-player game to build an AI for, I decided to code my own RPG combat system for some reason, and then build an AI to play it. Since it was coded in standard C with some object-oriented ideas, it quickly got out of hand and, surprisingly, they limited the games to a pre-defined subset the next year.
One of the professors teaching this class suggested that four of us attend the Autumn Institute for AI organized by the AI research group of CNRS, a week of talks on AI, with this first session focusing on AI and games. Throughout the week research scientists from CNRS and other labs presented many approaches to AI, ranging from game theory to deep reinforcement learning methods like AlphaGo's. Although it was aimed more at PhD students than at master's students like us, it gave me a good introduction to a wide range of research fields, and I spent the next few days and weeks reading about neural networks. As I was struggling to implement the backpropagation algorithm in my first neural network - coded from scratch in Java - I instead used, again, a genetic algorithm to optimize it. That was, unknowingly, my first step into neuroevolution.
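That workaround, evolving the weights instead of backpropagating, can be sketched in a few lines. The network shape and GA settings below are illustrative (and in Python rather than my original Java):

```python
import math, random

def forward(w, x1, x2):
    """A fixed 2-2-1 tanh network; w is a flat list of 9 weights and biases."""
    h1 = math.tanh(w[0] * x1 + w[1] * x2 + w[4])
    h2 = math.tanh(w[2] * x1 + w[3] * x2 + w[5])
    return math.tanh(w[6] * h1 + w[7] * h2 + w[8])

XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

def loss(w):
    return sum((forward(w, *x) - y) ** 2 for x, y in XOR)

def evolve_weights(pop_size=30, generations=300):
    """No backpropagation: just keep the lowest-loss weight vectors
    and refill the population with Gaussian-mutated copies of them."""
    pop = [[random.uniform(-2, 2) for _ in range(9)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=loss)
        elite = pop[: pop_size // 3]             # best third survives
        pop = elite + [[g + random.gauss(0, 0.3) for g in random.choice(elite)]
                       for _ in range(pop_size - len(elite))]
    return min(pop, key=loss)
```

It is far less sample-efficient than backpropagation, but it only needs the loss value, which is exactly why it rescued my gradient-free Java network.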
Finally, for the second-year research project at Supaero I got the chance to be supervised by Caroline Chanel and Nicolas Drougard on a project aiming at using deep RL (DQN) to learn which information to send during human-machine interactions where the human is under high levels of stress, reducing their capacity to multitask or focus.
Gap year and machine learning
Like most students at Supaero I chose to take a gap year between my second and last years of engineering (equivalent to between the first and second year of an MSc), dedicated to two six-month internships, in order to get to know actual work environments and to decide what to specialize in for my last year.
For my first internship I took the role of CEO's right hand at Pricemoov, a startup offering dynamic pricing as a service based on data and AI. Although my role was mostly administrative and organizational, working in a small team allowed me to see how an end-to-end process using data and machine learning can be implemented, including relationships with clients and real-world application issues. I transferred my VBA experience to Google Apps Script, the language used for macros in Google Sheets, and developed tools for internal management.
My second internship took place at the consulting firm Wavestone as a cybersecurity consultant. While half of my time was dedicated to consulting missions for clients, I also had a project analyzing the potential for AI in cybersecurity and developing a proof of concept of anomaly detection based on the firm's own security logs. This project gave me a first hands-on approach to classical machine learning algorithms and data processing in Python. Studying cybersecurity also pushed me to start looking at how pieces of software interact and what their lower-level parts are actually made of.
During this gap year I had several periods without any work-related coding missions, which led me to start multiple side projects in Python, including a predictive model for the Women's Football World Cup, the basis of a graph-based data transformation tool generating SQL queries, and an AutoML tool.
Understanding what I’m doing
These experiences convinced me to take a major in Data and Decision Sciences at Supaero for the last year, with a minor in Robotics.
The major gave me an academic foundation for the data science and machine learning tools I was already using, plus a few notions of the R language, although being already comfortable with Python I decided to keep using it to get a deeper understanding of its inner workings.
One project was a notebook explaining an ML research paper to other students; guided by my interest in evolutionary methods, I managed to get the Multidimensional Genetic Programming for Multiclass Classification paper by La Cava et al.
Finally for that year’s research project I joined a team looking at the use of deep learning and deep RL techniques for combinatorial optimization, to learn from past instances how to solve expensive optimization problems.
Although it was a pretty dense year, I still somehow found time to dabble in a few side projects, such as representing hexagonal grids and implementing pathfinders for game design purposes, trying to predict esports game results with machine learning and Elo ratings, or exploring data visualization with the League of Legends data APIs.
My biggest side project, however, was called Genepy, an artificial life simulation in which animals interact, mate and take decisions with brains that evolve with them. To make this possible, I implemented the core idea of the NEAT paper in order to have simple neural networks that could be evolved. Realizing I was reading and implementing an ML research paper at night, during my short winter holidays, for a side project was probably the deciding factor in choosing to pursue a PhD in machine learning.
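NEAT's core idea is that a genome is a list of connection genes tagged with innovation numbers, so evolution can grow the topology itself, not just the weights. A very stripped-down sketch of that encoding (nowhere near the full paper, and not Genepy's actual code):

```python
import random
from dataclasses import dataclass

@dataclass
class Gene:
    """One connection gene: an edge of the network plus NEAT bookkeeping."""
    src: int
    dst: int
    weight: float
    enabled: bool = True
    innovation: int = 0

class Genome:
    innovation_counter = 0   # global counter, shared by the whole population

    def __init__(self, n_in, n_out):
        self.nodes = list(range(n_in + n_out))
        self.genes = []
        for i in range(n_in):                       # start fully connected
            for o in range(n_in, n_in + n_out):
                self.add_gene(i, o, random.uniform(-1, 1))

    def add_gene(self, src, dst, weight):
        Genome.innovation_counter += 1
        self.genes.append(Gene(src, dst, weight, True,
                               Genome.innovation_counter))

    def mutate_add_node(self):
        """NEAT's add-node mutation: split an enabled connection in two,
        disabling the old gene and inserting a new hidden node."""
        old = random.choice([g for g in self.genes if g.enabled])
        old.enabled = False
        new_node = len(self.nodes)
        self.nodes.append(new_node)
        self.add_gene(old.src, new_node, 1.0)        # keeps behaviour close
        self.add_gene(new_node, old.dst, old.weight)
```

The innovation numbers are what makes crossover between different topologies possible, since matching genes can be aligned across genomes.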
The name Genepy comes from the Python suffix “-py” on the French word “génépi”, a local alcohol from the Savoie region which is where I implemented it during the Christmas break.
Covid19 and free time
My last year at Supaero got interrupted by the first Covid-19 lockdown in early 2020. With classes ending and my internship in Montreal cancelled, I suddenly had a lot of free time that couldn't be spent outside, so I decided to start developing a game I had been designing for a while. Since it was my main language I first tried Python, but updating multiple parts of the user interface asynchronously proved too challenging, and the options to export the result in a platform-independent format were limited. After a Python prototype I switched to Godot, an open-source game engine with many UI management tools and its own language, GDScript, which can execute game mechanics synchronously. Godot also offers multiple ways to export games, including easy-to-share web-based versions. Although it allowed me to develop a simple version of the game I had in mind, I had to stop to start a research internship with Dennis Wilson at Supaero on the evolution of neural networks, finally turning my long-lasting hobby interest in evolutionary methods into research.
Although the internship took much of my time, not being able to leave home due to the lockdown meant I still had time for lighter side projects, such as a data visualization tool for an esports fantasy game, meant to choose the best combination of cards to play each day based on history. It relied on Python for web scraping, Dataiku (which I had discovered at Pricemoov) for data processing, and Tableau Software for the visualization. After sharing it in a small Discord server it got used [CHECK NUMBER OF TIMES] times, but unfortunately the code was lost when the virtual machine running the Dataiku server, with all the scripts, got corrupted.
Groinkbot
From a simple chatbot…
As my internship finished at the end of November and my PhD only started mid-January, I had some time to kill at the end of 2020, though not many opportunities to travel or even go outside since France was going through its second lockdown. Having been promoted to moderator on a Twitch stream where I often had to spam the same few commands, I developed a simple Twitch chatbot running on a Raspberry Pi that sent specific messages when a button was pressed. This made moderating faster than typing, but it was limited to a few pre-defined commands and still relied on human action, so I started mapping answers to specific messages to reply automatically. As my in-game name had been Groink for a few years, the bot was simply named Groinkbot.
As the number of commands grew and I wanted more options, I started a completely new version of Groinkbot with a modular architecture, compatible with Twitch chat, Discord servers and potentially other platforms. Groinkbot relies on modules of commands defined in Python files, which are imported by the main loop and used when parsing incoming chat messages. I added a feature to re-import code modules while the bot was running, allowing me to work on the bot and deploy new features without killing the process. It also takes permission levels into account, so that some commands can only be triggered by moderators, limiting spam from random accounts.
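In Python, the re-import trick mostly comes down to importlib.reload. Here is a hypothetical sketch of that dispatch loop; the COMMANDS convention and all names are my assumptions, not Groinkbot's actual code:

```python
import importlib

# Assumed convention: each command module exposes a COMMANDS dict mapping
# a trigger word to (minimum permission level, handler function).

class Bot:
    def __init__(self, module_names):
        self.modules = {name: importlib.import_module(name)
                        for name in module_names}

    def reload(self):
        """Re-import every command module without restarting the bot."""
        for name, mod in self.modules.items():
            self.modules[name] = importlib.reload(mod)

    def handle(self, user, message, level=0):
        """Dispatch a chat message to the first matching command
        the sender has permission to use."""
        for mod in self.modules.values():
            for trigger, (min_level, handler) in mod.COMMANDS.items():
                if message.startswith(trigger) and level >= min_level:
                    return handler(user, message)
        return None
```

Calling `bot.reload()` from a privileged chat command is then enough to deploy a new feature while the process keeps running.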
Groinkbot also featured a control page served by a web server also running on the Raspberry Pi, to start and stop the bot without having to SSH into the Pi. However, since a Python process could only control one chatbot, running several in parallel had many side effects, leading to a third version that kept most of the chatbot features of the second one but relied on a separate management framework, simply called GroinkApp.
… to a full application manager
GroinkApp is a framework to manage multiple applications in parallel with a unified control interface, itself hosted as an application by the manager. Apps can range from Twitch chatbots to temperature sensors, including web servers or SQL databases accessible through web APIs, and they all run in separate threads. Since each chatbot is now a separate app, they can run in parallel without side effects and be managed from the same control page.
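The manager's core idea is simple to sketch: every app exposes the same run/stop interface and gets its own thread. This is a hypothetical simplification, not GroinkApp's real code:

```python
import threading

class App:
    """Base class assumed for all apps: subclasses override step()."""
    def __init__(self, name):
        self.name = name
        self._stop = threading.Event()

    def run(self):
        while not self._stop.is_set():
            self.step()
            self._stop.wait(0.01)   # yields; wakes immediately on stop()

    def step(self):
        pass                        # app-specific work goes here

    def stop(self):
        self._stop.set()

class Manager:
    """Starts each app in its own thread and shuts them all down cleanly."""
    def __init__(self):
        self.apps = {}

    def start(self, app):
        thread = threading.Thread(target=app.run, daemon=True)
        self.apps[app.name] = (app, thread)
        thread.start()

    def stop_all(self):
        for app, thread in self.apps.values():
            app.stop()
            thread.join()
```

A control page then only needs to call `start` and `stop_all` (or per-app `stop`), which is what makes hosting the control interface itself as just another app possible.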
This project was a great opportunity to work on application architecture with the modular design of the chatbot and of the manager, on user interaction through commands, on web frontend development with HTML, CSS, and embedded JavaScript in the control page, and on web backend development with Flask and some dynamic URL assignment.
Feedback
Groinkbot has now been running for more than a year in the Twitch chat of a stream with a few hundred viewers, interacting with users through commands and trigger words. It can sing song lyrics in chat, live-translate any foreign message into English, react to specific words from specific people for private jokes, or organize games with humans in the chat. Its modularity makes it extremely easy to add functions, and projects with ask/tell interfaces can be adapted to work in the bot in a few minutes.
This development and improvement would not have been possible without the feedback of people interacting in the chat, showing its true potential and limitations and giving me the drive to keep working on it. Groinkbot is one of my very few projects that reached a "production" state, i.e. being used daily by outside people with good adoption and great interactions, and it was thanks to these people that it got to that point.
Internship
- GA
- Neuroevo
- Dota
- Julia
- GENE
PhD
- BERL
- DQNES (incl DQN, Rainbow)
- automation: bash
- slurm
- MPI
A word of conclusion
What I learnt
After that decade of developing for school projects, internship missions and (mostly) side projects, I have come across quite a wide variety of problems in different conditions, although they probably represent only a tiny fraction of the problems I could encounter and that I look forward to exploring in the future.
Even though I haven’t been able to keep professional capabilities in most of the 15 or 17 programming languages I’ve encountered, they taught me how to quickly learn how to use new languages by exploring documentation and tools, by trial and error, and by exploiting similarities in previously seen languages: VBA and Google Script do the same thing, VHDL and Godot Script both run all expressions synchronously, and C pointers help understand how Python works under the hood. Understanding how each language reacts and works fed my comprehension of computer science more globally and gave me more tools to tackle new problems.
- game dev => top-to-bottom learning: clear goal, learn more as problems happen
- curiosity: many topics
- regret: not finishing many projects (Genepy, Godot, Twine)
Engineer or Research Scientist?
both:
- trained as engineer
- curious like scientist