Python modules for machine learning/reinforcement learning


#1

Hello all,

Sorry if this has already been addressed, I'm a bit of a newbie here.

I've seen a couple of topics in the past requesting more modules for Python 2&3, and I'd like to add a few more to the wishlist:
- sklearn, the most commonly used machine learning module
- keras+theano, a very popular pair of modules to train & use neural networks

I don't know if many people here use these kinds of methods, but I think reinforcement learning could be a viable path in some of the games available.

Thank you.

(thank you also for adding numpy last year, staying below 50ms in GitC would have been a nightmare without it :slight_smile: )


#2

I use those libraries on a weekly basis and they're indeed awesome (especially Keras). However, aside from NumPy / SciPy, I don't think they're necessary.

  • Running those on CG within the allotted runtime would be difficult (loading Theano + Keras + a serialized model and weights would take much longer than that)

  • Other languages would require the same treatment: Breeze for Scala, D4J for Java, ... I think I've heard that even C/C++ doesn't have all the possible compiler optimizations enabled. And that would be a lot of work to maintain.

  • I've been on CG for only a few months and I've reimplemented more algorithms from scratch than I ever had before. It felt great and I learned a lot. I believe that is a core aspect of CG, unlike Kaggle, for example, which is much more focused on the competition.

You might already know about those :slight_smile:

http://iamtrask.github.io/2015/07/12/basic-python-network/

which are great examples of that.


Neural Network Ressources
#3

Thank you for your reply, you raise many good points.

Running those on CG within the allotted runtime would be difficult (loading Theano + Keras + a serialized model and weights would take much longer than that)

If we are to talk purely about the technical aspect, I think that is very manageable in 1 second.
I'm not necessarily talking about deep learning. Small to medium neural networks and things like decision forests would be fast to load.

Still, I'm not saying that would be an easy task. Maybe you'd have to zip and base64-encode your models to fit in the 100k character limit, but that's where the fun begins :slight_smile:
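
To make the idea concrete, here is a minimal sketch of the compress-then-encode step, assuming the model can be reduced to nested lists of floats. The function names and the toy weights are made up for illustration, and JSON + zlib is only one possible choice of format.

```python
import base64
import json
import zlib


def encode_weights(weights):
    """Serialize nested lists of floats into a compact ASCII string."""
    raw = json.dumps(weights, separators=(",", ":")).encode("utf-8")
    # Compress first, then base64-encode so the result can be pasted as text.
    return base64.b64encode(zlib.compress(raw, 9)).decode("ascii")


def decode_weights(blob):
    """Inverse of encode_weights; meant to run once when the bot starts."""
    raw = zlib.decompress(base64.b64decode(blob))
    return json.loads(raw.decode("utf-8"))


# Hypothetical toy weights: a single 2x3 layer.
weights = [[0.5, -1.25, 2.0], [0.0, 0.75, -0.5]]

blob = encode_weights(weights)   # paste this string into the submitted code
restored = decode_weights(blob)  # what the bot would rebuild at startup
```

Checking `len(blob)` against the 100k character limit before pasting would tell you whether the model fits. Base64 grows the compressed bytes by roughly a third, so compressing first matters.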

Other languages would require the same treatment: Breeze for Scala, D4J for Java, ... I think I've heard that even C/C++ doesn't have all the possible compiler optimizations enabled. And that would be a lot of work to maintain.

Agreed.
I'd just like to point out that if one of the goals of CG is to help people learn new languages, one could argue that modules are at the center of Python's philosophy. So it would make sense to have the most mainstream modules available. (I mean, you have to know that one: https://xkcd.com/353/)

I've been on CG for only a few months and I've reimplemented more algorithms from scratch than I ever had before. It felt great and I learned a lot. I believe that is a core aspect of CG, unlike Kaggle, for example, which is much more focused on the competition.

I partially agree with that.

As much as this is true for most algorithms (forests are especially fun to implement), is it reasonable to ask people to rewrite their own version of Keras? (I mean, I will if I have to, but I'm not looking forward to it :stuck_out_tongue: )

Anyway, if this is CG's philosophy, I'm ok with it.

Thank you for the Q-learning links, it looks like a very interesting approach :slight_smile:


#4

Actually, it's possible to use TensorFlow on CG, but only in Python (not Python3).
That's probably because CG's Python environment is always the same and has to be compatible with the machine learning puzzle.
It would be cool if we could upload a TensorFlow session file to use a locally trained model.


#5

Oh that's great, I didn't know :slight_smile:

There are many ways to do that. You could pickle a TensorFlow object and paste it in your code, or just dump the parameters.
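
As a rough sketch of the "dump the parameters" route: assuming you have already pulled the trained values out into plain nested lists (how you do that depends on your framework and is not shown here), you can print them as a Python literal, paste that line into your bot, and run the forward pass by hand. The names `dump_params`, `forward` and `LAYERS`, the tanh activation, and the numbers are all hypothetical.

```python
import math


def dump_params(name, params):
    """Return one line of Python source that recreates `params` when pasted."""
    return "{} = {!r}".format(name, params)


def forward(x, layers):
    """Tiny dense network: each layer is (weights, biases) with tanh activation."""
    for weights, biases in layers:
        x = [math.tanh(sum(w * v for w, v in zip(row, x)) + b)
             for row, b in zip(weights, biases)]
    return x


# Hypothetical parameters, standing in for values exported after local training.
LAYERS = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0]),  # 2 inputs -> 2 hidden units
    ([[1.0, 1.0]], [0.0]),                    # 2 hidden units -> 1 output
]

source_line = dump_params("LAYERS", LAYERS)

# Simulate pasting the generated line into the submitted code.
namespace = {}
exec(source_line, namespace)
output = forward([1.0, 1.0], namespace["LAYERS"])
```

This only needs the standard library at runtime, which sidesteps the question of which modules are installed on CG.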


#6

Sklearn would be great!


#7

The technology's main purpose is to develop the algorithms for machine learning with the help of which you can achieve automatic data processing. So, instead of writing additional code, developers may put the data inside of an algorithm that is working on its self-perfection all the time by searching for data patterns.


#9

Can you give an example with a detailed explanation of exactly how to do this, or link to a good resource, please? I could not find any resources.


#10

Sklearn would be a great addition. I don’t think we should have to recreate everything from scratch.


#11

Sklearn models can be created offline and packaged with pickle to be incorporated into the game.
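
A minimal sketch of that pickle round trip, using a plain dict as a hypothetical stand-in for the trained model (the `predict` helper and the values are made up for illustration):

```python
import base64
import pickle

# Offline step: a hypothetical "trained model" reduced to plain data.
model = {"feature": 0, "threshold": 0.5}

# Protocol 2 is readable by both Python 2 and Python 3.
blob = base64.b64encode(pickle.dumps(model, protocol=2)).decode("ascii")


def predict(m, row):
    """Decision stump: 1 if the chosen feature exceeds the threshold."""
    return 1 if row[m["feature"]] > m["threshold"] else 0


# In the submitted code: paste `blob` as a string constant, then restore it.
loaded = pickle.loads(base64.b64decode(blob))
```

One caveat: pickle stores a reference to an object's class, not the class code itself. Unpickling a real sklearn estimator therefore only works where sklearn can be imported, so on a platform without it you would have to export the learned parameters as plain data like the dict above.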