Lately I have read and heard many discussions on this topic. People from very different backgrounds have believed since the beginning of the last century that robots are possible in principle. So why are computer systems still far from seemingly simple functions that are successfully performed not only by people, but also by much simpler insects? Theory says that consciousness builds internal models of the external world, works with those models, and makes decisions based on them. But to keep an internal model in step with the world, consciousness must first recognize the images coming from the external world; otherwise the model is incomplete and therefore useless. For example: if a bee cannot recognize a flower, it will not be able to collect nectar.
The key idea of the pattern recognition approach is to give a computer system (CS) the ability to adapt to changes in its environment while performing its tasks. You can do this by creating a basic set of images of objects and actions in the world, together with a set of possible responses to them. Based on these sets, the CS can build its own strategy or adapt an existing one, and then use it to perform tasks under the current circumstances.
This defines the key problem that blocks further development of computer systems in the field of adaptive systems and robotics.
Pattern recognition is necessary for a wide range of obvious tasks, such as recognizing faces or other objects in photos, speech, and text, but also for less obvious tasks such as forecasting peak load in power grids, or predicting the behavior of prices or rates by recognizing patterns in the factors that precede the predicted events.
I managed to create a practical implementation of a universal recognition algorithm for computer systems.
The system of algorithms is based on a classical artificial neural network (ANN), the perceptron. Mathematical and other details can be found via the links above. I will try to describe the method by analogy, so that it is clear to anyone.
Back propagation is an ANN training method that works on a set of samples. For example, the vector of brightness values of the image points serves as the input sample to the ANN, and the reference value we need at the output is a signal corresponding to the meaning of the image: 1 if the sought pattern is recognized, and 0 otherwise.
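The article gives no code, so here is a minimal sketch of this training setup: a single-layer perceptron trained by gradient descent on tiny, made-up "brightness vectors". The 4-pixel samples, learning rate, and epoch count are all illustrative assumptions, not values from the article.

```python
import math
import random

random.seed(0)

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def train_perceptron(samples, epochs=2000, lr=0.5):
    """samples: list of (pixel_vector, target), target is 1 or 0."""
    n = len(samples[0][0])
    w = [random.uniform(-0.5, 0.5) for _ in range(n)]
    b = 0.0
    for _ in range(epochs):
        for x, t in samples:
            y = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            # Gradient of the squared error through the sigmoid.
            delta = (y - t) * y * (1.0 - y)
            w = [wi - lr * delta * xi for wi, xi in zip(w, x)]
            b -= lr * delta
    return w, b

# Hypothetical 4-"pixel" samples: target 1 when the pattern is present.
samples = [([1, 0, 1, 0], 1), ([0, 1, 0, 1], 0)]
w, b = train_perceptron(samples)
out_face = sigmoid(sum(wi * xi for wi, xi in zip(w, [1, 0, 1, 0])) + b)
out_none = sigmoid(sum(wi * xi for wi, xi in zip(w, [0, 1, 0, 1])) + b)
```

On such trivially separable data the output approaches 1 for the positive sample and 0 for the negative one; the limitations described next appear as the samples grow larger and more complex.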
Ideally, a training set for back propagation should consist of a large number of samples, so that the trained ANN copes as accurately as possible with recognizing the real samples it will face in a real environment. But this approach has drawbacks and limitations that make it impossible to use for large sample sizes and for highly complex images.
This method has been used successfully to recognize individual letters and digits in an image, but for more complex tasks it did not work because of these limitations.
To better understand the mathematical essence of the method without using formulas (if you want the formulas, see the links above), let us represent the solution space of the problem as a graph surface:
By the right solution we mean the minimum point on the graph. The back propagation method, which is a modification of the classical gradient descent optimization method, can be compared to the force of gravity acting on the blue metal ball. The pull of gradient descent, by analogy with gravity, causes a ball placed anywhere on the surface of the graph to roll down to the minimum point, giving the right solution, as shown schematically in the figure above.
But what if the surface of the graph is more complicated? For example, what if it has flat regions or potholes that do not allow the ball to roll down to the bottom, or several minima of which only one, the deepest, is the right solution?
This is the limitation of back propagation: it can train the perceptron only for very simple pattern recognition tasks, whose solution spaces contain no sets of local minima (several dimples) and no flat surfaces.
In practice, because of this limitation, the method fails even on fairly simple examples. For instance, when trying to recognize close-up faces in 24-by-24-pixel black-and-white images, it failed on a training set of just two images: one with a face and one without. So using this method for anything more complex than identifying a small set of simple symbols on a homogeneous background is not possible.
But what if we try to use several ANNs instead of one, each with different initial synapse weights?
This helps in some cases slightly harder than plain back propagation, as shown in the figure above. But if there are many dimples or flat areas, or additional irregularities on the hillside that prevent the ball from reaching the correct answer even when it lands right next to it, the effectiveness of this approach becomes minimal.
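A minimal sketch of this multi-start idea: run gradient descent from several random starting points and keep the best result. The surface below (a quartic with one shallow and one deep minimum near x ≈ -1.3) and all the constants are illustrative assumptions; a single unlucky start can get stuck in the shallow dimple, while several starts usually find the deeper one.

```python
import random

random.seed(1)

def descend(df, x0, lr=0.01, steps=500):
    x = x0
    for _ in range(steps):
        x -= lr * df(x)
    return x

# f(x) = x**4 - 3x**2 + x has two minima; the deeper one is near x = -1.3.
f  = lambda x: x**4 - 3 * x**2 + x
df = lambda x: 4 * x**3 - 6 * x + 1

# Multi-start: several random initial points, keep the lowest result found.
starts = [random.uniform(-2, 2) for _ in range(10)]
best = min((descend(df, x0) for x0 in starts), key=f)
```

With many dimples this brute-force restarting scales poorly, which is exactly the weakness the article describes.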
In testing, this modification stopped working at only 4 samples: 2 images with faces and 2 without, in the example described in detail further on. So this modification can improve the performance of gradient descent, but not enough to come close to the specialized pattern recognition methods that currently exist, for example those based on Haar features.
The algorithms presented in the new recognition technology are universal and suitable for solving a variety of problems related to pattern recognition and adaptive control. Below is a description of the technology, including possible areas and boundaries of application, different combinations of the algorithms, and the key values of their parameters.
The key idea of the technology for creating trained ANNs for pattern recognition problems is a directed search over a representation of the ANN's structure and contents as a sequence of bits (a serialized representation, or imprint). Operations on these imprints turn the search for solutions from completely random into a directed one. The crucial point here is the possibility of directly converting an ANN into an imprint (serialization) and back from an imprint into an ANN (deserialization).
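One way to picture serialization and deserialization, as a sketch: flatten the network's weights into a string of bits and rebuild them from it. The article does not specify the encoding; representing each weight as a 32-bit float via the standard `struct` module is my assumption for illustration.

```python
import struct

def serialize(weights):
    """Flatten a weight vector into a bit string (the 'imprint')."""
    raw = struct.pack('%df' % len(weights), *weights)
    return ''.join(format(byte, '08b') for byte in raw)

def deserialize(bits):
    """Rebuild the weight vector from an imprint with the same layout."""
    raw = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    return list(struct.unpack('%df' % (len(raw) // 4), raw))

weights = [0.5, -1.25, 2.0]          # toy "network contents"
imprint = serialize(weights)          # e.g. a 96-bit string of '0'/'1'
restored = deserialize(imprint)       # round-trips back to the weights
```

Any fixed, invertible encoding works; what matters is that every ANN maps to a bit string and every bit string maps back to an ANN, so the search operations below can act on bits.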
The basic structure of the technology's main algorithm is an iterative approach to creating a trained ANN. At each step, a basic set of transformations produces the set of ANNs of a new iteration, which with high probability will be better than the set of the current iteration. Here are the two basic operations on ANN imprints:
1. Combining two or more ANN imprints with inheritance of matching characteristics. The general principle: compare 2 or more ANN imprints position by position; if the majority of the bits at a given position across all participating imprints have the same value, the output bit of the new imprint at that position is filled with the majority value. Otherwise, the corresponding bit of the result is filled with a random value.
2. Random modification of an imprint. The probability of random modification (Pmod) can be set in the configuration or chosen as a function of the quantitative estimates of the ANN. One possible implementation of this operation is to walk sequentially over each bit of the imprint, generating for it a random value P, for example in the range [0, 1]; if the resulting value of P is less than Pmod, the bit is inverted, otherwise it is kept unchanged.
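The two operations above can be sketched directly on bit strings. This is a minimal illustration, not the article's implementation; the example imprints and the Pmod value are assumptions.

```python
import random

random.seed(0)

def combine(imprints):
    """Operation 1: majority vote per position; ties get a random bit."""
    out = []
    for bits in zip(*imprints):
        ones = bits.count('1')
        zeros = len(bits) - ones
        if ones > zeros:
            out.append('1')
        elif zeros > ones:
            out.append('0')
        else:
            out.append(random.choice('01'))   # no majority: random value
    return ''.join(out)

def mutate(imprint, p_mod=0.01):
    """Operation 2: flip each bit independently with probability Pmod."""
    return ''.join(
        ('1' if b == '0' else '0') if random.random() < p_mod else b
        for b in imprint
    )

# Three toy parent imprints; positions where most parents agree are inherited.
child = mutate(combine(['1100', '1010', '1001']))
```

Here the first position is '1' in all three parents and is inherited, while the other positions resolve to the majority value '0'; the mutation step then occasionally perturbs the result.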
The first operation narrows the search space to the distance between the best (lowest of all) ANNs of the iteration: the combined ANNs of the new iteration will appear above the graph surface within the resulting interval. The second operation expands the search space for the next iteration, covering not only the space between the best ANNs of the current iteration but also the neighboring areas, which reduces the chance of the algorithm looping in local minima (pits that are not the deepest), as shown in the figure:
In practice it is more effective to use not just the two best ANN imprints (the blue balls in the figure) but more of them, as well as a larger iteration size (all the "balls" of the iteration shown).
Methods of increasing diversity by adding a set of random ANN imprints to the new iteration also proved effective. Efficiency increases further if the maximum "height" of the new imprints is limited by the average "height" of the ANNs of the current iteration.
It is also important not to lose the imprints of ANNs with good results, those that have achieved a high level of compliance with the training set.
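Putting the pieces together, one iteration of the search might look like the sketch below: keep the best imprints untouched, breed the rest by combining and mutating the top performers, and inject a few random imprints whose "height" preferably does not exceed the current average. Everything here is an illustrative assumption; in particular, the "height" function stands in for the real error on a training set, so a fixed target bit string plays the role of a perfect solution.

```python
import random

random.seed(2)

TARGET = '1011001110100101'   # stand-in for a perfect imprint; in reality
                              # "height" would be the error on the training set

def height(imp):
    """Error measure: bits differing from the target (lower is better)."""
    return sum(a != b for a, b in zip(imp, TARGET))

def combine(a, b):
    """Two-parent combining: keep matching bits, randomize the rest."""
    return ''.join(x if x == y else random.choice('01') for x, y in zip(a, b))

def mutate(imp, p_mod=0.05):
    return ''.join(('1' if b == '0' else '0') if random.random() < p_mod else b
                   for b in imp)

def random_imprint(n):
    return ''.join(random.choice('01') for _ in range(n))

def next_iteration(population, size=20, elites=2, randoms=4):
    ranked = sorted(population, key=height)
    new = ranked[:elites]                       # never lose the best imprints
    avg = sum(map(height, population)) / len(population)
    while len(new) < size - randoms:            # children of the best parents
        a, b = random.sample(ranked[:4], 2)
        new.append(mutate(combine(a, b)))
    for _ in range(randoms):                    # diversity: random imprints,
        cand = random_imprint(len(TARGET))      # preferring "height" no worse
        for _ in range(50):                     # than the average (bounded retry)
            if height(cand) <= avg:
                break
            cand = random_imprint(len(TARGET))
        new.append(cand)
    return new

pop = [random_imprint(len(TARGET)) for _ in range(20)]
for _ in range(40):
    pop = next_iteration(pop)
best = min(pop, key=height)
```

On this toy problem the best imprint closes in on the target within a few dozen iterations; with a real ANN, evaluating "height" means deserializing each imprint and scoring it against the training set.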
Modifications of the basic algorithm for different needs:
Clustered algorithm. Besides the correct values, which are close to the target values in the samples, each ANN in the cluster is allowed output values that can be interpreted as "not sure". The overall conformity assessment is then assigned to the entire cluster (a set of several ANNs): for each input vector of the training set there must exist at least one ANN whose result is sufficiently close to the target value, and there must be no ANN in the cluster that gives a false result.
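This acceptance rule for one sample can be sketched as follows. The closeness threshold and the example outputs are illustrative assumptions; targets are binary (0 or 1) as in the article's setup.

```python
def cluster_passes(outputs, target, close=0.2):
    """outputs: responses of every ANN in the cluster to one sample.
    The cluster passes the sample if at least one ANN is within `close`
    of the target AND no ANN is confidently wrong (within `close` of the
    opposite answer); anything in between counts as "not sure"."""
    wrong = 1.0 - target
    has_correct = any(abs(y - target) <= close for y in outputs)
    has_false = any(abs(y - wrong) <= close for y in outputs)
    return has_correct and not has_false

# One ANN answers confidently, the other two abstain ("not sure"): passes.
ok = cluster_passes([0.95, 0.5, 0.6], target=1.0)
# A confidently wrong member fails the whole cluster, even with a correct one.
bad = cluster_passes([0.95, 0.05, 0.5], target=1.0)
```

The overall conformity score of the cluster is then the number of training samples for which this rule holds; individual ANNs only need to be confident on their own "share" of the samples.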
To test the pattern recognition technology described above, I used a training set of 80 monochrome images of 24 by 24 pixels. 40 of them contain human faces, and the remaining 40 contain no faces but include heavy noise and various objects that are not human faces. The ANN output has to be close to 1 for the samples with faces and 0 otherwise.
The results were as follows. The pure back propagation method stopped producing results at just two images with faces and two without.
The basic algorithm, without any improvements, remained accurate for up to 20 images, which is already good, since even back propagation with randomized restarts could not handle more than 10 images.
But the greatest result came from the clustered modification of the basic algorithm: all 80 images were recognized with good precision by a cluster of 8 ANNs.
If you have any questions about pattern recognition with ANNs, or about efficient ways to implement pattern recognition for your business, please feel free to contact me via email - novikovstepan@gmail.com or Skype - novikovsteve.