A university lecturer set out to deepen his knowledge of modern neural networks and to gain practical skills in designing them. The subject of his thesis was a task of real importance to one of the divisions of a metallurgical plant.
The city's metallurgical plant produces metal products. One of its intermediate products is a steel billet with a 300x360 cross section; a branding machine mechanically stamps a nine-digit mark on the end face of each billet.
One of the technological operations is loading these billets into a furnace for heating before rolling. Loading is performed manually by an operator. Charging depends on the steel grade, which is encoded in the stamped mark. The operator must also compare this code with the database entry for each of the four furnaces and for the specific customer. If he makes a mistake and puts the wrong billet in the furnace, the customer receives the wrong product and charges the company a fine that amounts to ten million rubles.
To help the operator, a system is being built that automatically identifies the number of the cast billet and so provides a second, independent check.
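The duplicate check described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the function name check_billet, the CheckResult type and the dictionary standing in for the plant database are all hypothetical, and the real system may organise this data quite differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CheckResult:
    ok: bool
    reason: str


def check_billet(recognized_code, furnace_id, expected_codes):
    """Compare a recognized mark with the codes scheduled for a furnace.

    expected_codes is a hypothetical stand-in for the plant database:
    a dict mapping a furnace id to the set of nine-digit codes planned
    for that furnace.
    """
    # A valid mark consists of exactly nine ASCII digits.
    is_valid = (
        len(recognized_code) == 9
        and recognized_code.isascii()
        and recognized_code.isdigit()
    )
    if not is_valid:
        return CheckResult(False, "unreadable code")
    if recognized_code in expected_codes.get(furnace_id, set()):
        return CheckResult(True, "match")
    return CheckResult(False, "code not scheduled for this furnace")


# Made-up example data for two of the four furnaces.
expected = {1: {"123456789"}, 2: {"987654321"}}
result = check_billet("123456789", 2, expected)
print(result.ok, result.reason)
```

In this sketch an unreadable mark is reported separately from a mismatch, so that the operator can tell a recognition failure from a genuinely wrong billet.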
Initially, the recognition task was tackled with convolutional neural networks. They could not solve it completely, however: recognition accuracy did not rise above 65%.
During the training he learned about more powerful neural network architectures capable of solving the task. Throughout the study, working with a team of like-minded people, he addressed the question of which type of neural network recognizes the digits of the mark most accurately. The team studied the basic architectures from the Segmentation Models library, namely Unet, FPN, Linknet and PSPNet, with the resnet34 and seresnet34 networks as backbones. The work also covered the installation, training and operation of FasterRCNN.
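The comparison described above amounts to a small grid of architectures and backbones. The sketch below enumerates that grid and shows how one candidate might be built. It assumes the Keras segmentation_models package (the backbone names in the text match its naming) and a class count of eleven (ten digits plus background); neither is confirmed by the source, and the helper names are hypothetical.

```python
from itertools import product

# Architecture and backbone names as listed in the text.
ARCHITECTURES = ("Unet", "FPN", "Linknet", "PSPNet")
BACKBONES = ("resnet34", "seresnet34")

# Assumption: ten digit classes plus one background class.
N_CLASSES = 11


def experiment_grid():
    """Return every (architecture, backbone) pair to be compared."""
    return list(product(ARCHITECTURES, BACKBONES))


def build_model(architecture, backbone, n_classes=N_CLASSES):
    """Build one candidate network (needs segmentation_models installed)."""
    # Imported lazily so the grid can be inspected without Keras/TensorFlow.
    import segmentation_models as sm

    constructor = getattr(sm, architecture)  # e.g. sm.FPN
    return constructor(backbone, classes=n_classes, activation="softmax")


if __name__ == "__main__":
    for architecture, backbone in experiment_grid():
        print(architecture, backbone)
```

Calling build_model("FPN", "seresnet34") would give the combination that the study later singles out; training and evaluation code is left out because the source does not describe it.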
To create a dataset, a video camera was installed at the plant; the footage was used to assemble a training set of 10,000 images.
The study showed that, among the networks from the Segmentation Models library, the best result comes from FPN with the seresnet34 backbone, and that this result is comparable in quality to that of the FasterRCNN network.
On a test set of 1000 images, FPN reached 90% accuracy and FasterRCNN 92%. FasterRCNN is also the faster of the two: it takes about 0.2 seconds to produce a result, while FPN spends 1 second on processing.
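To put these figures in operational terms, the short calculation below converts accuracy into an expected number of misread images on the 1000-image test set and processing time into throughput. It uses only the numbers quoted above; the helper names are hypothetical, and the source does not say whether accuracy is counted per digit or per whole mark.

```python
def expected_errors(n_images, accuracy):
    """Number of images expected to be misread at a given accuracy."""
    return round(n_images * (1 - accuracy))


def throughput(seconds_per_image):
    """Images processed per second for a given processing time."""
    return 1.0 / seconds_per_image


TEST_SET_SIZE = 1000

fpn_errors = expected_errors(TEST_SET_SIZE, 0.90)    # FPN
frcnn_errors = expected_errors(TEST_SET_SIZE, 0.92)  # FasterRCNN

print("FPN misreads:", fpn_errors)
print("FasterRCNN misreads:", frcnn_errors)
print("FPN images per second:", throughput(1.0))
print("FasterRCNN images per second:", throughput(0.2))
```

On these numbers FPN misreads about 100 of the 1000 test images and FasterRCNN about 80, while FasterRCNN handles roughly five images in the time FPN needs for one.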
The plan is to raise recognition accuracy to 99%. One hypothesis is that the best result will be obtained by combining these two networks.
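The source does not say how the two networks would be combined. One simple possibility, shown here purely as an illustration, is to take for each digit position the prediction of whichever network is more confident, and to flag any mark on which the networks disagree. The function name, the input format and the confidence values are all assumptions.

```python
def combine_predictions(fpn_pred, frcnn_pred):
    """Merge two per-digit predictions by picking the more confident one.

    Each argument is a list of (digit, confidence) pairs, one pair per
    position of the mark. Returns the combined code and a flag telling
    whether both networks agreed on every position.
    """
    if len(fpn_pred) != len(frcnn_pred):
        raise ValueError("predictions must cover the same number of digits")

    digits = []
    agreed = True
    for (d1, c1), (d2, c2) in zip(fpn_pred, frcnn_pred):
        if d1 != d2:
            agreed = False
        # Ties go to the first network.
        digits.append(d1 if c1 >= c2 else d2)
    return "".join(digits), agreed


# Made-up two-digit example: the networks disagree on the second digit.
fpn = [("1", 0.9), ("2", 0.4)]
frcnn = [("1", 0.8), ("7", 0.6)]
print(combine_predictions(fpn, frcnn))
```

The disagreement flag matters as much as the merged code: in a duplicate-control setting, a mark on which the two networks differ could be referred to the operator instead of being accepted automatically.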