Can a bot learn bridge through experience? GIB programming
#1
Posted 2017-October-29, 08:37
#2
Posted 2017-October-29, 09:47
fstrick604, on 2017-October-29, 08:37, said:
Matt Ginsberg enhanced 50+ year-old AI techniques with his partition search to create his GIB bridge program.
AlphaGo is based on two more modern neural networks and was trained on top human Go games.
AlphaGo Zero reduced that to one neural network and was taught only the rules of Go. It learned just by playing earlier versions of itself.
Bridge rules and scoring are more complex than Go's. Bridge has a large element of chance, including bluff and mixed-strategy components, and it requires partnership communication and co-operation. To emulate AlphaGo, GIB would need to be completely rebuilt from scratch around a neural network, but I think the DeepMind team could manage it using their TPUs (Tensor Processing Units), although the program would probably need more games to train itself.
An interesting by-product might be much better bidding and carding systems.
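The self-play loop AlphaGo Zero uses can be caricatured in a few lines. The sketch below is purely illustrative: the "game" is one-pile Nim (take 1-3 stones, taking the last stone wins), the "network" is just a table of position values, and all the parameters are made up. It is nothing like DeepMind's actual code, but it shows the core idea of a program finding good play from the rules alone:

```python
import random

# Toy self-play learner: no strategy is coded in, only the rules.
# A tabular value function stands in for AlphaGo Zero's neural net.
PILE, TAKES = 12, (1, 2, 3)

def self_play_train(episodes=30000, eps=0.2, alpha=0.1, seed=0):
    rng = random.Random(seed)
    V = {}  # V[pile]: estimated value for the player to move
    for _ in range(episodes):
        pile, history = PILE, []
        while pile > 0:
            moves = [t for t in TAKES if t <= pile]
            if rng.random() < eps:   # explore occasionally
                take = rng.choice(moves)
            else:                    # else leave opponent the worst position
                take = min(moves, key=lambda m: V.get(pile - m, 0.0))
            history.append(pile)
            pile -= take
        # the player who took the last stone won; credit alternates back
        result = 1.0
        for p in reversed(history):
            V[p] = V.get(p, 0.0) + alpha * (result - V.get(p, 0.0))
            result = -result
    return V

V = self_play_train()
# self-play should discover that piles that are multiples of 4 are losing
print(sorted(p for p in V if V[p] < 0))
```

Nim is solved and trivial, of course; the point is only that the losing positions emerge from self-play without ever being told to the program.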
#3
Posted 2017-November-02, 11:12
nige1, on 2017-October-29, 09:47, said:
AlphaGo is based on two more modern neural networks and was trained on top human Go games.
AlphaGo Zero reduced that to one neural network and was taught only the rules of Go. It learned just by playing earlier versions of itself.
Bridge rules and scoring are more complex than Go's. Bridge has a large element of chance, including bluff and mixed-strategy components, and it requires partnership communication and co-operation. To emulate AlphaGo, GIB would need to be completely rebuilt from scratch around a neural network, but I think the DeepMind team could manage it using their TPUs (Tensor Processing Units), although the program would probably need more games to train itself.
An interesting by-product might be much better bidding and carding systems.
Thanks for the reply. Certainly it would be much more complex to do. It would have to have a bidding system, as it does now. I wonder, though, whether it could develop bidding judgment by playing and learning from its mistakes. I don't know; it is interesting. I know I learned systems mostly from books, but judgment mostly from years of playing, and not always good judgment either. One thing I have always struggled to overcome is a fear of doubling, because I find it so humiliating to have one wrapped. That should be easy to overcome, but for me it isn't. I think a robot would do better in that area, as it would never be ruled by irrational emotions.
#4
Posted 2017-November-02, 11:27
fstrick604, on 2017-November-02, 11:12, said:
Who is going to tell GIBBO it has made a mistake?
vrock
#5
Posted 2017-November-02, 13:19
fstrick604, on 2017-October-29, 08:37, said:
Matt is no longer associated with GIB so the simple answer is no. Could somebody else do the programming? GIB was not designed to "learn" so you would have to throw out all of the old code and start over again, going in a different direction. So a new program would have basically nothing in common with the old GIB.
#6
Posted 2017-November-02, 13:24
(GIB itself was designed to play Moscito Byte, a much simpler system for robots than 2/1, but much more complicated for humans).
#7
Posted 2017-November-03, 08:36
virgosrock, on 2017-November-02, 11:27, said:
Donald Michie's matchbox noughts-and-crosses machine would learn bad play against really poor players because, against them, bad play won more games than good play. Modern AI programs seem to cope with this kind of "plateau" problem.
Bridge is complex. For example, there is the chance element: a bid or play that would be inferior in the long run can "get lucky" on a particular layout. Neural networks seem to be able to cope with this kind of fuzziness.
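Michie's MENACE scheme is simple enough to sketch. Everything below (the class name, the bead counts, the toy "swindle" demo) is illustrative rather than his exact machine, but the reinforcement rule is roughly the one he described: add beads to the moves of a won game, confiscate one from each move of a lost game.

```python
import random

# Sketch of a MENACE-style learner: each state is a "matchbox" holding
# beads for each legal move; moves are drawn in proportion to bead
# counts, and the beads used in a game are reinforced afterwards.
class Menace:
    def __init__(self, initial_beads=4):
        self.boxes = {}      # state -> {move: bead count}
        self.initial = initial_beads
        self.picked = []     # (state, move) pairs used this game

    def choose(self, state, legal_moves, rng=random):
        box = self.boxes.setdefault(
            state, {m: self.initial for m in legal_moves})
        pool = [m for m, n in box.items() for _ in range(n)]
        move = rng.choice(pool)
        self.picked.append((state, move))
        return move

    def learn(self, won):
        # winning adds beads; losing removes one (never below one,
        # so the move stays playable)
        for state, move in self.picked:
            if won:
                self.boxes[state][move] += 3
            else:
                self.boxes[state][move] = max(1, self.boxes[state][move] - 1)
        self.picked = []

# The "plateau" effect in miniature: against a weak opponent who always
# falls for the swindle, the objectively unsound move gets reinforced.
m = Menace()
rng = random.Random(1)
for _ in range(200):
    move = m.choose("start", ["sound", "swindle"], rng)
    m.learn(won=(move == "swindle"))
print(m.boxes["start"])
```

Against that opponent the "swindle" beads swamp the "sound" ones, which is exactly the plateau: the machine has learned the field, not the game.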
#8
Posted 2017-November-03, 10:00
nige1, on 2017-November-03, 08:36, said:
Donald Michie's matchbox noughts-and-crosses machine would learn bad play against really poor players because, against them, bad play won more games than good play. Modern AI programs seem to cope with this kind of "plateau" problem.
Bridge is complex. For example, there is the chance element: a bid or play that would be inferior in the long run can "get lucky" on a particular layout. Neural networks seem to be able to cope with this kind of fuzziness.
Does GIBBO know about Kelsey's Law of Vacant Places? Was it "places" or "spaces"? I forget.
To me, after playing Money Bridge on BBO ad nauseam, the solution seems simple: "Look at my hand/HCP. If this does not match the Blurb, do something else." Not sure whether that can be implemented in software, but it should be. It would take care of a lot of the problems people are seeing: the Blurb engine and the bidding engine don't communicate.
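For what it's worth, the vacant-places arithmetic is trivial to state. The little function below is just a worked example of the principle (not anything GIB actually does): once some of a defender's 13 cards are known, the odds that an unseen card sits with a defender are proportional to that defender's remaining unknown ("vacant") places.

```python
from fractions import Fraction

def vacant_places_odds(west_known, east_known):
    """Probability that one specific missing card is with East,
    given how many of each defender's 13 cards are already known."""
    west_vacant = 13 - west_known
    east_vacant = 13 - east_known
    return Fraction(east_vacant, west_vacant + east_vacant)

# Example: West has shown 7 spades in the bidding and East has shown 2,
# so a missing queen is with East with odds 11:6.
print(vacant_places_odds(west_known=7, east_known=2))  # 11/17
```

With no information at all the odds are of course even, which the same formula gives: `vacant_places_odds(0, 0)` is 1/2.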
vrock
#9
Posted 2017-November-03, 12:52
virgosrock, on 2017-November-03, 10:00, said:
I'm not sure that GIB understands finesses. It was my understanding that GIB always ran simulations (maybe not on opening leads?) to determine the best course of play, so if GIB used the best estimates of the other hands' distributions, the simulations should produce the best percentage plays. To the extent that GIB's descriptions frequently have little correlation with the actual hand, the simulations aren't going to be particularly accurate.
Another problem is that GIB doesn't seem to use enough simulations to correctly model low-percentage scenarios (e.g. a 4-0 split), so it won't make a no-cost safety play.
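The sample-size point is easy to quantify. Using the standard a priori figure for a 4-0 break (all four missing cards falling in the same defender's hand), a small batch of simulated deals will quite often contain no 4-0 layout at all, in which case the safety play never looks necessary. A back-of-the-envelope check:

```python
from math import comb

# A priori chance that four missing cards split 4-0: the four cards
# fall among the defenders' 26 slots, all four in the same hand.
p_40 = 2 * comb(13, 4) / comb(26, 4)   # about 9.6%

# Chance that n independent simulated deals contain no 4-0 split at all
for n in (10, 25, 50, 100):
    print(f"{n:3d} deals: P(no 4-0 in sample) = {(1 - p_40) ** n:.3f}")
```

With only a handful of deals per decision, the rare split is simply absent from the sample more than a third of the time, so a simulation-driven declarer has no reason to guard against it.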
#10
Posted 2017-November-03, 18:01
johnu, on 2017-November-03, 12:52, said:
Another problem is that GIB doesn't seem to use enough simulations to correctly model low-percentage scenarios (e.g. a 4-0 split), so it won't make a no-cost safety play.
People in the know state quite firmly that simulations are done only by advanced GIBBO, or in rare cases where the bidding has advanced quite a bit and it has to make "hard" decisions.
It rarely takes finesses; it mostly goes for strip-squeeze and endplay scenarios. Based on GIBBO's actions, my understanding is that GIBBO was designed for end positions.
vrock
#11
Posted 2017-November-03, 18:20