Trying to find object (x, y) coordinates in an image: my neural network seems to optimize the error without learning
I generate images of a single coin pasted on a white background of size 200x200. The coin is randomly chosen among 8 euro coin images (one for each coin) and has:
- a random rotation;
- a random size (between fixed bounds);
- a random position (chosen so that the coin is not cropped).
Here are two examples from the dataset (center markers added): [image: two dataset examples]
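A minimal sketch of such a generator, for readers who want to reproduce the setup. It is not the code used for the post: a flat gray disk stands in for the rotated coin photo, and `make_sample`, `r_min` and `r_max` are hypothetical names and bounds chosen for illustration.

```python
import numpy as np

def make_sample(rng, size=200, r_min=15, r_max=40):
    """Return (image, (cx, cy)) for one synthetic sample.

    Assumption: a gray disk replaces the coin image, and r_min / r_max
    are made-up size bounds, not values from the post.
    """
    radius = int(rng.integers(r_min, r_max + 1))
    # Restrict the center so the whole disk stays inside the image.
    cx = int(rng.integers(radius, size - radius))
    cy = int(rng.integers(radius, size - radius))
    image = np.ones((size, size, 3), dtype=np.float32)  # white background
    yy, xx = np.mgrid[0:size, 0:size]
    mask = (xx - cx) ** 2 + (yy - cy) ** 2 <= radius ** 2
    image[mask] = 0.5  # paint the stand-in "coin"
    return image, (cx, cy)

rng = np.random.default_rng(0)
image, (cx, cy) = make_sample(rng)
```

The target for each image is simply the returned `(cx, cy)` pair.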
I am using Python + Lasagne. I feed the color image to a neural network whose output layer has 2 fully connected linear neurons, one for x and one for y. The targets associated with the generated coin images are the coordinates (x, y) of the coin center.
I have tried (following the "Using convolutional neural nets to detect facial keypoints" tutorial):
- a dense layer architecture with various numbers of layers and numbers of units (500 max);
- a convolutional architecture (with 2 dense layers before the output);
- the sum or the mean of squared differences as the objective function;
- target coordinates in the original range [0,199] or normalized to [0,1];
- dropout layers between the layers, with a dropout probability of 0.2.
I always used simple SGD, tuning the learning rate to try to get a nicely decreasing error curve.
I found that as I train the network, the error decreases until the point where the output is always the center of the image. It looks like the output is independent of the input. It seems that the network output is the average of the targets I give. This behaviour looks like a simple minimization of the error, since the positions of the coins are uniformly distributed over the image. This is not the wanted behaviour.
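The collapse to the image center is what squared error predicts for a network that ignores its input: among all constant outputs, the mean of the targets gives the lowest mean squared error. A small check of this, assuming centers drawn uniformly over [0, 199] (the real centers are restricted so the coin is not cropped; `targets` and `constant_mse` are illustrative names):

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical targets: coin centers drawn uniformly over a 200x200 image.
targets = rng.uniform(0, 199, size=(10000, 2))

def constant_mse(guess):
    """Mean squared error of predicting the same (x, y) for every image."""
    return float(np.mean(np.sum((targets - guess) ** 2, axis=1)))

mean_guess = targets.mean(axis=0)    # lands close to the image center
corner_guess = np.array([0.0, 0.0])

mse_mean = constant_mse(mean_guess)      # best achievable constant output
mse_corner = constant_mse(corner_guess)  # a worse constant output
```

`mse_mean` is a useful baseline: a training loss that plateaus near it suggests the network has only learned the average target.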
I have the feeling that the network is not learning but is just trying to optimize the output coordinates to minimize the mean error against the targets. Am I right? How can I prevent this? I tried removing the bias of the output neurons, because I thought maybe I was only modifying those biases while all the other parameters were being set to 0, but this didn't work.
Is it possible for a neural network alone to perform well at this task? I have read that one can also train a net for present/not-present classification and then scan the image to find possible locations of objects. But I wondered whether it was possible using only the forward computation of a neural net.
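For reference, the classify-and-scan alternative mentioned above can be sketched as a sliding window. Everything here is illustrative: `scan_image`, `window` and `stride` are hypothetical, and `darkness_score` is a stand-in for a trained present/not-present classifier, which this sketch does not include.

```python
import numpy as np

def scan_image(image, score_fn, window=40, stride=10):
    """Slide a window over the image; return the center of the best patch."""
    h, w = image.shape[:2]
    best_score, best_xy = -np.inf, None
    for top in range(0, h - window + 1, stride):
        for left in range(0, w - window + 1, stride):
            patch = image[top:top + window, left:left + window]
            score = score_fn(patch)
            if score > best_score:
                best_score = score
                best_xy = (left + window / 2.0, top + window / 2.0)
    return best_xy, best_score

def darkness_score(patch):
    # Stand-in for a classifier's "coin present" probability:
    # darker patches on a white background score higher.
    return float(1.0 - patch.mean())

# Toy grayscale image: white background with one dark 40x40 square.
image = np.ones((200, 200), dtype=np.float32)
image[100:140, 60:100] = 0.3
best_xy, best_score = scan_image(image, darkness_score)
```

The resolution of the estimate is limited by `stride`, and a single fixed `window` does not handle coins of varying size, which is part of why scanning is more work than a single forward pass.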
Question: how can I prevent [overfitting without improvement to test scores]?
What needs to be done is to re-architect your neural net. A neural net just isn't going to do a good job at predicting an x and y coordinate directly. It can, though, create a heat map of where it detects a coin or, said another way, you could have it turn your color picture into a "coin-here" probability map.
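One way to train such a network is to replace each (x, y) target with a per-pixel map. The answer does not say how to build these maps; the sketch below assumes a Gaussian blob around the known center, with `heatmap_target` and `sigma` as hypothetical choices.

```python
import numpy as np

def heatmap_target(cx, cy, size=200, sigma=10.0):
    """Build a size x size map that peaks at 1.0 on the coin center.

    Assumption: a Gaussian blob is used as the "coin-here" target;
    sigma is an arbitrary width, not a value from the post.
    """
    yy, xx = np.mgrid[0:size, 0:size]
    d2 = (xx - cx) ** 2 + (yy - cy) ** 2
    return np.exp(-d2 / (2.0 * sigma ** 2)).astype(np.float32)

# Example: a coin centered at x=120, y=60.
target = heatmap_target(cx=120, cy=60)
```

A disk mask covering the coin itself would serve the same purpose when the coin radius is known for each generated image.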
Why? Neurons are well suited to measuring probability, not coordinates. Neural nets are not the magic machines they are sold to be; they really do follow the program laid out by their architecture. You'd have to lay out a pretty fancy architecture to have the neural net first create an internal spatial representation of where the coins are, then another internal representation of their center of mass, then another that uses the center of mass and the original image size to somehow learn to scale the x coordinate, and then repeat the whole thing for y.
Easier, much easier, is to create a coin detector convolution that converts your color image into a black-and-white image: a probability-a-coin-is-here matrix. Then feed that output to custom hand-written code that turns the probability matrix into an x/y coordinate.
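The hand-written step can be as small as a weighted center of mass. A minimal sketch, assuming the detector outputs a 2-D array of probabilities and that a single coin is present; `heatmap_to_xy` and its `threshold` are illustrative, not part of any library.

```python
import numpy as np

def heatmap_to_xy(prob, threshold=0.5):
    """Return the (x, y) center of mass of a probability map, or None.

    Pixels below `threshold` are ignored so background noise does not
    pull the estimate toward the image center.
    """
    prob = np.asarray(prob, dtype=np.float64)
    weights = np.where(prob >= threshold, prob, 0.0)
    total = weights.sum()
    if total == 0:
        return None  # no pixel looks like a coin
    yy, xx = np.mgrid[0:prob.shape[0], 0:prob.shape[1]]
    x = float((weights * xx).sum() / total)
    y = float((weights * yy).sum() / total)
    return x, y

# Toy probability map: a confident square region centered at x=120, y=60.
prob = np.zeros((200, 200))
prob[50:71, 110:131] = 0.9
x, y = heatmap_to_xy(prob)
```

Taking the arg-max of the map is an even simpler alternative, but the center of mass gives sub-pixel estimates and is less sensitive to a single noisy pixel.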
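Question: is it possible for a neural network alone to perform well at this task?
A resounding yes, as long as you set up the right neural net architecture (like the one above), but it would probably be much easier to implement and faster to train if you broke the task into steps and only applied the neural net to the coin detection step.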