Neural Network Resources


#21

Great news! Waiting for you to come back then! :slight_smile:

If by any chance you have a Linux system, I encourage you to use Agade's Runner Arena: https://github.com/Agade09/CSB-Runner-Arena. It's a great way to monitor your Neural Networks' progress.


#22

You might be interested in the article called Importance of Machine Learning Applications in Various Spheres.


#26

I made a chatbot using a Neural Network. I basically copy-pasted this tutorial and worked from there.

Architecture
It uses a Gated Recurrent Unit (GRU), a type of Recurrent Neural Network (RNN), to “eat” the characters in the chat one by one and output what it thinks the next character should be. I trained three versions, on the world, fr and ru channels, by teaching them to predict the respective chat logs in the format “username:text” (timestamp removed). Then, whenever the bot is triggered to speak, I feed it “Bot_Name:” and it generates a sentence character by character, which I push onto the chat. The network doesn’t actually output characters but a probability distribution over the next character; a multinomial draw from that distribution then selects each character.
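For concreteness, here is a minimal sketch of that sampling loop in Keras (toy character set, untrained model, and names of my own choosing; illustrative only, not the bot's actual code):

```python
# Minimal sketch of character-level generation with a GRU (illustration only).
import numpy as np
import tensorflow as tf

VOCAB = sorted(set("abcdefghijklmnopqrstuvwxyz0123456789_: \n"))
CHAR_TO_ID = {c: i for i, c in enumerate(VOCAB)}

# Embed each character, run the recurrence, and output a probability
# distribution over the next character.
model = tf.keras.Sequential([
    tf.keras.layers.Embedding(len(VOCAB), 64),
    tf.keras.layers.GRU(256, return_sequences=True),
    tf.keras.layers.Dense(len(VOCAB), activation="softmax"),
])

def generate(prompt, max_len=200):
    ids = [CHAR_TO_ID[c] for c in prompt]
    while len(ids) < max_len:
        probs = model.predict(np.array([ids]), verbose=0)[0, -1]
        probs = probs.astype("float64")
        probs /= probs.sum()                             # guard against float32 rounding
        next_id = np.random.choice(len(VOCAB), p=probs)  # multinomial draw
        ids.append(next_id)
        if VOCAB[next_id] == "\n":                       # a newline ends the message
            break
    return "".join(VOCAB[i] for i in ids)

# Trigger the bot: feed it "bot_name:" and let it complete the line.
print(generate("automaton2000:"))
```

With a trained model, the multinomial draw is what keeps the bot from always emitting the single most likely character.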

Motivation: Automaton2000
One motivation I had for this project, apart from personal interest, was to help bring back Automaton2000. Automaton2000 was taking too much RAM on Magus’ cheap dedicated server (1 GB of RAM): the Markov chain of all possible words had exceeded that limit. I made Magus a variable-length Markov chain C++ version (standalone and daemon) that consumes less RAM, but it still easily exceeds 1 GB. Magus took that work and made it consume even less RAM (github), but I suppose it wasn’t good enough. He also tried to store the tree on disk instead of in RAM, but that was too slow. A Neural Network seemed like the ultimate solution to the RAM problem, because even large NNs may only take ~100 MB worth of weights (e.g. the Keras applications networks for image classification). A NN could potentially also take context into account, such as answering people by name and staying on the topic of the ongoing conversation, which a Markov chain cannot.
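To see why the word-level Markov chain eats RAM, here is a toy order-1 version (my own illustration; the real one was variable-length): every distinct word-to-word transition gets its own counter, so the table grows with the chat log, whereas a NN's weight count stays fixed.

```python
# Toy order-1 word-level Markov chain (illustration only, not the actual code).
import random
from collections import Counter, defaultdict

chain = defaultdict(Counter)  # word -> Counter of following words

def train(lines):
    for line in lines:
        words = line.split()
        for prev, nxt in zip(words, words[1:]):
            chain[prev][nxt] += 1  # one counter per distinct transition

def generate(start, max_words=20):
    out = [start]
    while len(out) < max_words and chain[out[-1]]:
        nexts = chain[out[-1]]
        words = list(nexts)
        weights = [nexts[w] for w in words]
        out.append(random.choices(words, weights=weights)[0])
    return " ".join(out)

train(["hello world", "hello there world", "world peace"])
print(generate("hello"))
```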

Training consumes a lot of RAM (GPU VRAM, actually; training takes ~2 hours per channel on a 1070), but inference can be done on CPU, and the bot currently consumes roughly 300 MB of RAM plus ~20 MB per channel. Generating sentences character by character is a bit slow, taking up to a second on my computer and several seconds on Magus’ server. Automaton2000 is currently back up and running on this work.

Issues
One thing that is quite noticeable is the overuse of common words and phrases. One version liked to say “merde” a lot on the French channel. I guess the network simply learns what is most common.

Future work (including suggestions from @inoryy)

  • Bug fixes
  • BLEU metric (see the sketch after this list)
  • Seq2Seq architecture?
  • Use a higher-level NLP framework like Sockeye?
  • Word-level instead of character-level?
  • Somehow ignore “bad data” like things said by Automaton2000?
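
For reference, BLEU scores a generated sentence by its n-gram overlap with reference sentences. A minimal example with NLTK (my choice of tool, not something mentioned in the thread):

```python
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

reference = [["how", "are", "you", "today"]]  # tokenized reference sentence(s)
hypothesis = ["how", "are", "you"]            # tokenized bot output

# Smoothing avoids zero scores when some n-gram orders have no match.
score = sentence_bleu(reference, hypothesis,
                      smoothing_function=SmoothingFunction().method1)
print(f"BLEU: {score:.3f}")
```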

#27

The C++ version was bugged :frowning: It can’t work like that.

I think I’ll give it a last-hope shot with a NoSQL database.


#29

This isn’t directly related to CG, but the question of how to get started with reinforcement learning comes up a lot, so I figured I’d share my new blog post, Deep Reinforcement Learning with TensorFlow 2.0.

In the blog post I give an overview of the features in the upcoming major TensorFlow 2.0 release by implementing a popular DRL algorithm (A2C) from scratch. While the focus is on TensorFlow, I tried to make the post self-contained and accessible with regards to DRL theory, with a bird’s-eye overview of the methods involved.
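
For a taste of the algorithm, here is a compressed sketch of the A2C loss in TF 2.x eager style (my own shorthand, not code lifted from the post): a policy-gradient term weighted by advantages, a value regression, and an entropy bonus.

```python
import tensorflow as tf

def a2c_loss(logits, values, actions, returns,
             value_coef=0.5, entropy_coef=0.01):
    """logits: (B, num_actions); values, returns: (B,); actions: (B,) int."""
    advantages = tf.stop_gradient(returns - values)  # A(s,a) ~ R - V(s)
    # Policy term: -log pi(a|s) * advantage.
    neg_logp = tf.nn.sparse_softmax_cross_entropy_with_logits(
        labels=actions, logits=logits)
    policy_loss = tf.reduce_mean(neg_logp * advantages)
    # Value term: regress V(s) toward the observed returns.
    value_loss = tf.reduce_mean(tf.square(returns - values))
    # Entropy bonus encourages exploration.
    probs = tf.nn.softmax(logits)
    entropy = tf.reduce_mean(
        -tf.reduce_sum(probs * tf.math.log(probs + 1e-8), axis=-1))
    return policy_loss + value_coef * value_loss - entropy_coef * entropy
```

In training, this loss is computed under a `tf.GradientTape` and the gradients are applied to the shared actor-critic network.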