
IP cameras for license plate recognition: yes or no? Expert opinions. Hikvision license plate detection - Hikvision Ural. Cameras for reading license plates

22.07.2020

Let us take a closer look at the features of an IP camera for reading license plates.

It can be used for:

  • automatically opening the barrier at the entrance to a controlled territory;
  • automatically issuing fines when a driver breaks the rules within the camera's coverage area, based on the recognized license plate;
  • automatically calculating the cost of parking from the vehicle data;
  • immediately reporting the detection of a wanted car by comparing its number with a database.

All these analytical processes are performed automatically by the camera's internal software, or according to settings and functions specified by the user through software installed on a server. Before putting an IP camera for license plate recognition into operation, read the instructions for installing, configuring and operating the device. Network cameras for plate recognition come in different form factors and mounting types; choose one based on your current requirements and conditions.

We offer IP cameras for license plate recognition in our online store, at a price of 3000 rubles. All the necessary information about each device is available on the site.

Features of an IP camera for license plate recognition

Before you buy an IP camera with a plate recognition function, study its technical characteristics.

List of technical specifications:

  • Power supply options.
  • Type of software and ease of management.
  • Protection class of the IP camera.
  • Viewing angle.
  • Resolution.
  • Method of installation and connection.
  • Speed of information processing and match search.
  • Shooting and recording speed.
  • Operating temperature range of the camera.
  • Maximum permissible humidity.
  • The manufacturer's brand rating on the market of control and video surveillance systems, and user reviews.
  • Dimensions and weight of the device.
  • Package contents: mounting hardware needed for installation and operating instructions.

The cameras presented in this section have been tested by our specialists for compatibility with the Macroscop "Recognition of car numbers" software. In combination with this software, our cameras will provide constant monitoring of the protected area, help you search for a particular vehicle, and automate a number of processes.

Once you have selected a device that meets all your requirements, you can quickly place an order on the site. We will deliver the camera with plate recognition to the required site in Moscow as soon as possible.

Added: 2018-02-28 15:24:21

Modern video surveillance systems do more than collect a video stream; they also offer broad video analytics capabilities.

Functions such as counting visitors, face recognition, and recognizing and recording license plates have confidently moved beyond the interests and remit of the security services into everyday business tasks.

Let us dwell on one of the most in-demand video analytics functions: license plate recognition. Sometimes the video surveillance system is integrated with an access control system: the camera reads the plate, the analytics system checks the captured number against the database and, when a match is confirmed, lets the car in.

Separately, we note that when designing a video surveillance system, the task of plate recognition must be separated from the overview function (movement of vehicles and pedestrians, camera locations depending on the conditions of the observed area, and so on). A camera intended for plate recognition is subject to placement restrictions and requires special settings. The camera must be focused strictly on the zone that cars drive through (in most cases it is 3-4 meters). For this reason, cameras with a fixed lens are recommended. They also usually have better light sensitivity than cameras with motorized lenses.

Which camera resolution should you choose?

For license plate recognition, a high-resolution camera may give worse results than a camera with the calculated resolution. This is because light sensitivity deteriorates as resolution increases, which degrades plate recognition at night.

The required horizontal resolution is calculated with the formula (W / N) * P,

where W is the width of the field of view in the zone where the plate is captured (m),

N is the width of the license plate (m),

P is the desired width of the plate in the image (pixels).

If we take a field-of-view width of 3 m, an average plate width of 0.52 m and an optimal plate image width (in practice) of 200 pixels, we get the following calculation:

(W / N) * P = (3 / 0.52) * 200 = 1154 pixels.

The calculation shows that a camera with HD resolution (1280 x 720 pixels) is suitable.
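The same arithmetic can be sketched in a few lines of Python. This is only an illustration of the (W / N) * P formula; the function names and the list of candidate frame formats are assumptions made for the example and do not come from the article.

```python
import math


def required_horizontal_pixels(view_width_m, plate_width_m, plate_width_px):
    """Apply (W / N) * P: pixels needed across the whole frame width."""
    return math.ceil(view_width_m / plate_width_m * plate_width_px)


# Illustrative list of common frame formats (an assumption, not from the article).
STANDARD_FORMATS = [
    (1280, "HD 1280 x 720"),
    (1920, "Full HD 1920 x 1080"),
    (2560, "2560 x 1440"),
]


def smallest_suitable_format(required_px):
    """Return the first listed format whose width covers the requirement."""
    for width, name in STANDARD_FORMATS:
        if width >= required_px:
            return name
    return None


needed = required_horizontal_pixels(3.0, 0.52, 200)
print(needed, smallest_suitable_format(needed))
```

Rounding up is a deliberate choice here: a fractional pixel count should not push the plate image below the target width.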

Cameras for a recognition system must have certain characteristics.

The physical size of the sensor should be taken into account. The larger the sensor, the more light-sensitive it is. The minimum acceptable sensor size for plate recognition is 1/3 inch. Sensors of 1/2 inch and larger perform best.

Figure 1. Comparison of images obtained at night and in daytime from cameras with different sensor sizes

When choosing a camera, you should also take into account the lens speed. It is determined by the choice of lens and is expressed as an F-number, the ratio of the focal length to the aperture diameter. The smaller the F-number, the more light reaches the sensor, the better the signal-to-noise ratio, and the less digital noise there is in the image. For plate recognition, use a lens no slower than F/1.4. A lens with F/1.3 gathers more light.

Note that whatever the camera's specifications, in the complete absence of lighting you will not get a recognized license plate. For this reason, consider additional lighting from the start. The vast majority of cameras now have IR illumination, but using the built-in IR illumination means switching the camera to black-and-white mode. In addition, the extra heat released by the IR illumination may be unwelcome in summer and lead to overheating, creating additional noise.

We also draw attention to the camera's frame rate. We, as a manufacturer, recommend a camera with a frame rate of 25 fps. In practice, however, at sites where vehicles move slowly, cameras are switched to 12 fps or lower, which reduces the load on the equipment that processes the data.

As mentioned above, there are rather strict limits on camera placement, and going beyond them significantly degrades the result.

The tilt of the license plate must not exceed 5° relative to the X axis of the two-dimensional image.

To capture two lanes, you can place the camera as follows:

The camera should be placed at a height of 2 to 6 meters. At sites with a barrier, bear in mind that the barrier itself creates a certain exclusion zone.

Technologies for software recognition of license plates and human faces are increasingly in demand. For example, automatic plate recognition can be used as part of an access control system, for billing in paid parking, for automating vehicle entry, or for collecting statistics (repeat visits to a shopping mall or car wash, for example). Modern intelligent software can do all of this. What is needed to implement such a system? In principle, not much: video cameras that meet certain requirements and a suitable intelligent software module. For example, or more budget

In this article, we explain how to choose a digital video camera capable of producing high-quality video suitable for software recognition tasks.

Resolution

A few years ago, the size of a license plate on the screen was measured as a percentage of the frame width. All cameras were analog and their resolution was a constant. Now that sensors may have a resolution from 0.5 to 12 MP, relative values no longer apply and the required plate width is measured in pixels.

As a rule, the specification of a recognition module states the plate width on the screen required for confident recognition. For example, the autotransmisser module requires a width of 120 pixels, while another requires 80 pixels. The differences are explained by nuances of the recognition algorithms and by the level of reliability the developer considers acceptable. From personal experience, the autotransmisser is more demanding and "capricious" about the choice of equipment and lens and the correctness of the camera setup. But once properly tuned, it gives consistently reliable results and depends little on weather conditions.

For greater reliability, we recommend aiming for a plate width of 150 pixels. Given that the plate width according to GOST is half a meter (520 mm, to be exact), this gives a required resolution of about 300 pixels per meter.

The linear resolution in pixels per meter depends on the viewing angle and the resolution of the camera sensor. It can be calculated by the formula:

R_lin = R_h / (2 * L * tan(α / 2))

R_lin - linear resolution, pixels per meter

R_h - horizontal resolution of the camera (for example, R_h = 1080)

α - camera viewing angle

L - distance from the camera to the object
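As a rough sketch, the linear resolution can be computed from the quantities defined above, assuming a simple pinhole geometry in which the field of view at distance L is 2 * L * tan(α / 2) wide. The function names are hypothetical, and the 30° viewing angle in the example is only an illustrative value.

```python
import math


def linear_resolution(h_res_px, view_angle_deg, distance_m):
    """Pixels per meter at the given distance (pinhole-geometry assumption)."""
    view_width_m = 2.0 * distance_m * math.tan(math.radians(view_angle_deg) / 2.0)
    return h_res_px / view_width_m


def max_distance(h_res_px, view_angle_deg, required_px_per_m):
    """Largest distance at which the required pixels per meter is still met."""
    half_angle = math.radians(view_angle_deg) / 2.0
    return h_res_px / (2.0 * required_px_per_m * math.tan(half_angle))


# Example: a 1920-pixel-wide sensor, 30 degree horizontal angle, 300 px/m target.
print(round(max_distance(1920, 30.0, 300.0), 1), "m")
```

The second function simply solves the same relation for L, which is how a "maximum distance" column for a given camera and lens could be estimated.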

You can also use our online calculator on the page of the product you are interested in, on the "What I See" tab.

Below, as an example, are several IP video surveillance cameras with the maximum distance at which license plate recognition is possible (plate width of 150 pixels). Note that for cameras with a varifocal lens the maximum focal length was used.

Table columns: focal length; horizontal resolution; max. distance, m; max. field-of-view width, m.

Horizontal resolution of the listed cameras: 1920, 1280, 2688, 2048 and 2048 pixels.

It is important to understand that higher-resolution cameras can monitor wider zones, so fewer of them are needed at the same site, while the linear resolution stays within the identification requirements. This makes high-resolution cameras economically justified in many situations.

Light sensitivity and shutter speed

For confident recognition of vehicle registration plates, the camera must have good light sensitivity and allow the shutter speed (exposure time) to be set manually. This requirement is extremely important when building systems that recognize cars moving at high speed. For cars moving at up to 30 km/h (which is what we usually implement for our customers: cottage settlements, residential complexes, shopping center parking lots, various closed territories) this requirement is less critical, but it must not be underestimated, because to achieve high recognition quality the camera should capture at least ten frames with a readable plate.
For example, to recognize the plate of a vehicle moving at 30 km/h with the camera installed at an angle of up to 10 degrees to the axis of movement, the shutter speed should be about 1/200 of a second. For many inexpensive cameras such a shutter speed gathers too little light even in the daytime in cloudy weather, and the picture will be dark and/or noisy. It is therefore worth paying close attention to the size and quality of the sensor. Ideally, use a specialized black-and-white camera with a CCD sensor. However, such cameras are very expensive and their resolution usually does not exceed 1 MP, which seriously limits their applicability.
In general, you should not chase high resolution without objective reasons. Relatively inexpensive ultra-high-resolution cameras (4 MP, 5 MP and higher) are built on 1/3, 1/2.8 and, less often, 1/2.5-inch sensors. Cameras with a resolution of 1.3 and 2 MP have sensors of the same size. As a result, each photosensitive element in a 1.3 MP camera is significantly larger than in a 5 MP camera, and the larger the element, the more light it can collect. That is why the IP cameras we recommend for recognition tasks rarely have a resolution above 2 MP.
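To get a feel for why a shutter speed of about 1/200 s is suggested at 30 km/h, here is a rough estimate of motion blur in pixels. The blur model is an assumption made for this sketch and is not derived in the text: only the component of the car's movement across the camera axis is counted, and the change of scale as the car approaches is ignored. The 300 px/m figure is the linear resolution discussed earlier.

```python
import math


def blur_in_pixels(speed_kmh, shutter_s, angle_deg, px_per_m):
    """Approximate smear of the plate image during one exposure.

    Assumption: only movement transverse to the camera axis blurs the
    image, so the travelled distance is scaled by sin(angle).
    """
    speed_ms = speed_kmh / 3.6                      # km/h -> m/s
    travelled_m = speed_ms * shutter_s              # path during the exposure
    transverse_m = travelled_m * math.sin(math.radians(angle_deg))
    return transverse_m * px_per_m


fast = blur_in_pixels(30.0, 1.0 / 200.0, 10.0, 300.0)
slow = blur_in_pixels(30.0, 1.0 / 50.0, 10.0, 300.0)
print(f"1/200 s: {fast:.1f} px, 1/50 s: {slow:.1f} px")
```

Under these assumptions the smear at 1/200 s stays at a couple of pixels, while a fourfold longer exposure smears the characters over roughly four times as many pixels.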

Wide dynamic range (WDR), backlight compensation

The dynamic range of a camera is the ratio between the maximum and minimum light intensity that its sensor can capture properly. In other words, it is the camera's ability to render brightly lit and dark areas of the image at the same time without distortion or loss. This parameter is very important for automatic plate recognition because it helps to combat glare blinding the camera. However, even the most advanced cameras with 140 dB WDR cannot always cope with high-contrast lighting. In that case, additional lighting is installed, either visible or IR, to illuminate the zone where plate recognition takes place.

Depth of field

Depth of field (DOF) is the range of distances within which objects appear sharp.

This parameter is determined by the focal length, the aperture and the distance to the object. The greater the depth of field, the larger the in-focus zone and the better the chance of catching a sufficient number of sharp frames of a moving car.

The lens aperture probably has the greatest effect on depth of field. The smaller the aperture opening, the greater the depth of field; the larger the opening, the smaller the depth of field. All cameras recommended for plate recognition can adapt to changing lighting conditions by adjusting the aperture automatically. It is recommended to adjust the focus of such cameras with the aperture opened as wide as possible, when the depth of field is minimal.

The greater the distance from the camera to the object, the greater the depth of field, so there is no need to place the camera as close as possible to the recognition zone. On the other hand, the longer the focal length, the smaller the depth of field. In our practice, the optimal distance from the camera to the vehicle is between 6 and 10 meters, although recognition from 100 meters is not impossible either.
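These three dependencies (aperture, distance, focal length) can be checked numerically with the standard thin-lens depth-of-field approximation. This is a sketch under stated assumptions: the circle of confusion of 0.005 mm is an illustrative value picked for a small sensor, and the lens parameters in the example are not taken from any camera in the article.

```python
def hyperfocal_m(focal_mm, f_number, coc_mm):
    """Hyperfocal distance in meters: H = f^2 / (N * c) + f."""
    return (focal_mm ** 2 / (f_number * coc_mm) + focal_mm) / 1000.0


def depth_of_field_m(focal_mm, f_number, distance_m, coc_mm=0.005):
    """Return (near, far) limits of acceptable sharpness, in meters.

    coc_mm is the assumed circle of confusion; 0.005 mm is illustrative.
    """
    f = focal_mm / 1000.0
    h = hyperfocal_m(focal_mm, f_number, coc_mm)
    near = distance_m * (h - f) / (h + distance_m - 2 * f)
    if distance_m >= h:
        far = float("inf")          # everything beyond "near" is sharp
    else:
        far = distance_m * (h - f) / (h - distance_m)
    return near, far


def dof_width(focal_mm, f_number, distance_m):
    """Length of the sharp zone in meters."""
    near, far = depth_of_field_m(focal_mm, f_number, distance_m)
    return far - near


# 12 mm lens focused at 8 m: wide open (F/1.4) versus stopped down (F/4).
print(depth_of_field_m(12, 1.4, 8.0))
print(depth_of_field_m(12, 4.0, 8.0))
```

Focusing with the aperture wide open, as recommended above, corresponds to tuning for the narrowest of these zones, so the focus can only get more forgiving when the iris closes in bright light.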

Distortion

Many lenses distort the image slightly. The most common is so-called barrel distortion. It arises because magnification is greater at the center and smaller at the edges, which changes the apparent size of an object. So if the same object appears at the center of the image and at its edge, it will look smaller at the edge. This may affect the ability to identify it.

The shorter the focal length, the more noticeable the distortion may be. Cameras with wide-angle lenses (less than 4 mm) are therefore undesirable for identification.

Noise and color reproduction

The less noise and the more accurate the color reproduction, the better for identification. It is therefore recommended to pay attention to parameters such as the camera's minimum illumination and the presence of noise reduction functions.
Noise suppression is particularly relevant in insufficient lighting, when the camera sensor becomes very noisy, which complicates identification. Keep in mind that in many cases noise reduction and other electronic "bells and whistles" cannot cope, and a sufficient level of lighting has to be provided at the site.

Video compression

Modern IP cameras transmit a compressed video signal: if there is no movement in the frame, or it is minimal, the traffic is small; if movement in the frame is intense, the traffic grows. So with a constant bitrate set in the camera settings, the picture will be suitable for identification when there is no movement but unsuitable when movement in the frame is intense.
For identification, it is recommended to set a variable bitrate with the highest quality level. This ensures the required image quality.


Sensor: 1/2.8" Progressive Scan CMOS
Hardware WDR 140 dB
Lens: 2.8-12 mm
Features: indoor camera; a thermal housing is needed for outdoor installation. The lens is not included and is purchased separately


Max. resolution: 1.3 MP, 1280 x 960 pixels
Hardware WDR
Lens: 2.8-12 mm

Outdoor 2 MP network camera AXIS P1365-E with WDR and Lightfinder

Sensor: 1/2.8" Progressive Scan CMOS
Max. resolution: 2 MP, 1920 x 1080 pixels
Hardware WDR
Lightfinder technology
Lens: 2.8-8 mm @ F1.3
Features: high sensitivity, autofocus

Dahua IPC-HF8301E, Ultra WDR 120 dB, Ultra 3DNR

Sensor: 1/3" Progressive Scan CMOS
Max. resolution: 3 MP, 2048 x 1536 pixels
Hardware WDR
Lens: 2.8-12 mm
Features: indoor camera; a thermal housing is needed for outdoor installation. The lens is not included and is purchased separately


Sensor: 1/3" Progressive Scan CMOS
Max. resolution: 1.3 MP, 1280 x 960 pixels
Lens: 2.8-8 mm (F1.2)
Features: high sensitivity, autofocus

Modern video surveillance lets you collect information about vehicle and pedestrian traffic and also provides various video analytics features.

Functions such as counting visitors and identifying faces have come into demand among private organizations and entrepreneurs.

Let us look in detail at the important function of license plate recognition. Video surveillance systems can be combined with an access control system. The camera reads the plate, the analytics system looks for a match in the database of numbers and, if one is found, tells the access control system to let the vehicle in.

When planning the installation of a video surveillance system, separate the task of plate recognition from the general observation of vehicles and pedestrians. Cameras for plate recognition are restricted in where they can be installed and need special configuration. The camera should be focused only on the section that vehicles pass through. It is therefore better to install cameras with a fixed lens, which also have an advantage in light sensitivity.

Camera resolution

A high camera resolution does not in itself guarantee good plate recognition. A camera with the calculated optimal resolution may even give a better result. The higher the resolution, the worse the light sensitivity, and this degrades plate recognition in poor lighting.

The required resolution is calculated with the formula (W / N) * P, where W is the width of the field of view in the zone where the plate is captured; N is the width of the license plate; P is the desired width of the plate in the image, in pixels.

Consider the following example: the average plate width is 0.52 m, the width of the controlled zone is 3 m, and the recommended plate width in the image is usually taken as 200 pixels. We get:

(W / N) * P = (3 / 0.52) * 200 = 1154 pixels.

The calculation shows that a camera with the standard HD format (1280 x 720 pixels) is a suitable option. But this holds only if the distance from the camera to the plate is 3-5 meters. At a greater distance a higher camera resolution is needed. If the distance exceeds 20 m, a camera with a varifocal lens is needed. It lets you narrow the viewing angle, thereby enlarging the captured object on the monitor screen.

Characteristics of video cameras for plate recognition

The size of the sensor itself must be taken into account. A larger sensor has greater light sensitivity. For plate recognition the sensor should be at least 1/3 inch, but high-quality recognition requires a sensor of 1/2 inch or more, for example an IP camera with a 1/1.8-inch Sony IMX 185 sensor.

The lens speed is equally important. It is a property of the camera lens and is denoted by the F-number, the ratio of the focal length to the aperture diameter. The signal-to-noise ratio is better with a faster lens, since more light reaches the sensor, and the amount of digital noise decreases as the light increases. Plate recognition requires a lens speed of F/1.4 or faster.

Even the very best cameras cannot read the plate of a car in complete darkness, so take care of adequate lighting from the start. Most modern cameras have IR illumination, but using it forces a switch to black-and-white mode. IR illumination also heats the camera, which may cause overheating in the hot season and create extra noise.

The frame rate also matters. Cameras with a frame rate of 25 fps are recommended. In areas where traffic moves slowly, the camera is switched to 12 fps or lower, which reduces the load on the device that processes the incoming data.

Camera placement

To get the expected result, the equipment must be placed in strict compliance with all the conditions.

  • In the image, the tilt of the license plate must not exceed 5° relative to the X axis.
  • The camera's pointing angle must be no more than 30° both horizontally and vertically.
  • To capture 2 lanes, the camera can be installed in the center between them.
  • The camera height must be within 2-6 meters.
  • When installing the device near a barrier, bear in mind that the barrier creates a certain exclusion zone.
  • After installing the camera, check that the image quality at night is acceptable. The iris mode is set to "Auto" with a level of 50.
  • To suppress headlight glare in the dark, use a camera with a shutter speed of 1/1000 s or faster.
  • If the road lacks adequate lighting, set the day/night function to "Auto". Otherwise, set the smart illumination to "On".
  • Backlight compensation (BLC) and WDR must be turned off.

To record plates in a database automatically, you need dedicated plate recognition software for the camera or a PC. Cameras that recognize license plates on their own are now also coming on sale.

It is time to describe in detail how our implementation of the number recognition algorithm works: what turned out to be a good decision and what worked rather poorly. It is also a way to report back to Habr users: by using the Android app Recognitor you helped us collect a decent-sized database of plate photos, shot in a completely unbiased way, with no instructions about how to shoot and how not to. And a database of images is the most important thing when developing recognition algorithms!

What happened with the Android app Recognitor
It was very nice that Habr users started downloading the app, trying it and sending us numbers.


App downloads and ratings

Since the app was published, 3,800 plate photos have arrived at the server from the mobile application.
We were even more pleased with the link http://21/116/121.70:10000/uploadimage: in 2 days we were sent about 8 thousand full-size photos of license plates (mostly from Vologda)! The server almost went down.

Now we have a database of 12,000 photos, and a gigantic amount of work on debugging the algorithms lies ahead. The most interesting part is just beginning!

Let me remind you that the Android application first locates the number in the frame. I will not dwell on that stage in this article. In our case it is a Haar cascade detector. This detector does not always work if the number in the frame is strongly rotated. We will leave the analysis of how our trained cascade detector works, and when it fails, for later articles. It really is very interesting. It seems like a black box: you train the detector and there is nothing more to do. In fact, that is not the case.

Still, a cascade detector is a good option when computing resources are limited. If the plate is dirty or its frame is poorly visible, Haar also performs well compared with other methods.

Number recognition

This part is about recognizing text in images like this one:


The general approaches to recognition were described in the first article.

Initially, we set ourselves the task of recognizing dirty, partially worn numbers heavily distorted by perspective.
First, it is interesting, and second, it seemed that clean numbers would then be recognized in 100% of cases. That is usually how it goes, of course. But here it did not work out. It turned out that while the probability of success for dirty numbers was 88%, for clean ones it was, say, 90%. In fact, the probability of getting from a photo in the mobile application to a successful answer turned out to be even worse than that figure: a little under 50% of the incoming images, however hard people tried when taking the pictures. That is, on average a number had to be photographed twice to be recognized successfully. Admittedly, this low percentage is largely because many people tried to shoot numbers from a monitor screen rather than in a real setting.

The entire algorithm was built for dirty numbers. But it turned out that now, in summer in Moscow, 9 out of 10 numbers are perfectly clean. So it is better to change the strategy and make two separate algorithms. If a clean number is recognized quickly and reliably, we send that result to the user; if not, we spend a bit more processor time and run the second algorithm, for dirty numbers.

A simple number recognition algorithm that should have been made from the start
How do you recognize a good, clean number? It is not difficult at all.

We set the following requirements for this algorithm:

1) some robustness to rotation (± 10 degrees);
2) robustness to minor scale changes (20%);
3) a plate border cut off by the edge of the frame, or simply a weakly pronounced border, must not break everything (this is fundamentally important: with dirty numbers you have to rely on the plate border, but if the number is clean, nothing characterizes it better than its digits and letters).

So, in clean and well-readable numbers all digits and letters are separated from one another, which means you can binarize the image and then either extract connected regions with morphological methods or use the well-known contour extraction functions.

Binarizing the frame

It is still worth first applying a mid-frequency (band-pass) filter and normalizing the image.


The image shows an originally small frame, for clarity.

Then we binarize with a fixed threshold (the threshold can be fixed because the image has been normalized).
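A minimal pure-Python sketch of this step is shown below. The text does not say which normalization was used, so min-max stretching to the 0..255 range is an assumption, as is the threshold of 128; the mid-frequency filter is left out. Images are represented as lists of rows of gray levels.

```python
def normalize(gray):
    """Stretch gray levels to the full 0..255 range (min-max, an assumption)."""
    lo = min(min(row) for row in gray)
    hi = max(max(row) for row in gray)
    span = (hi - lo) or 1            # avoid division by zero on flat images
    return [[(p - lo) * 255 // span for p in row] for row in gray]


def binarize(gray, threshold=128):
    """Mark dark pixels (candidate character strokes) as 1, the rest as 0."""
    return [[1 if p < threshold else 0 for p in row] for row in normalize(gray)]


frame = [
    [200, 210, 60],
    [205, 55, 58],
]
print(binarize(frame))
```

Because every frame is stretched to the same range first, one fixed threshold behaves similarly on bright and dim shots, which is the point made in the text.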

Frame rotation hypotheses

We assume several possible rotation angles of the image. For example, +10, 0 and -10 degrees:

The method used further on has some tolerance to the rotation angle of digits and letters, which is why such a large angle step, 10 degrees, is chosen.
From here on we work with each frame independently. Whichever rotation hypothesis gives the best result wins.

Next, we collect all connected regions. The standard findContours function from OpenCV was used here. If a connected region (contour) has a height in pixels between H1 and H2, and its width-to-height ratio is between K1 and K2, we keep it in the frame and note that this region may contain a character. At this stage almost certainly only digits and letters remain; the rest of the clutter disappears from the frame. We take the bounding rectangles of the contours, bring them to one scale, and then work with each letter or digit separately.
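The filtering rule can be sketched without OpenCV by working directly on bounding rectangles given as (x, y, width, height) tuples; in a real pipeline these would typically come from cv2.findContours followed by cv2.boundingRect. The values of H1, H2, K1 and K2 below are hypothetical, since the text does not give them.

```python
def plausible_characters(boxes, h1, h2, k1, k2):
    """Keep boxes whose height is in [h1, h2] and width/height in [k1, k2].

    boxes: iterable of (x, y, width, height); h1 is assumed to be positive.
    Returns the kept boxes ordered from left to right.
    """
    kept = []
    for x, y, w, h in boxes:
        if h1 <= h <= h2 and k1 <= w / h <= k2:
            kept.append((x, y, w, h))
    return sorted(kept)              # tuples sort by x first


boxes = [
    (50, 10, 14, 24),    # character-sized
    (10, 12, 15, 25),    # character-sized
    (0, 0, 200, 60),     # whole plate outline: too tall and too wide
    (30, 30, 3, 3),      # speck of noise: too small
]
print(plausible_characters(boxes, h1=15, h2=40, k1=0.3, k2=0.9))
```

Sorting by x at this point is a convenience for the later layout check, which reads the characters from left to right.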

These are the bounding rectangles of the contours that satisfied our requirements:

Letters / digits

The picture quality is good and all letters and digits are well separated, otherwise we would not have reached this step.
We scale all characters to one size, for example 20x30 pixels. Here they are:

By the way, when OpenCV performs Resize (when scaling to 20x30), the binary image turns into a grayscale one because of interpolation, so binarization has to be repeated.

Now the simplest way to compare a character with known character images is XOR (normalized Hamming distance). For example:

Distance = 1.0 - | sample xor image | / | sample |

With this definition a larger value means a closer match. If the value is above the threshold, we consider the character found; if it is below, we discard it.
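A small sketch of this comparison on binary images stored as lists of rows of 0/1 values. Here |sample| is read as the total number of pixels in the template, which is one possible interpretation of the formula; the tiny 2x2 templates and the 0.8 threshold are made up for the example.

```python
def similarity(sample, image):
    """1.0 - (number of differing pixels) / (number of pixels in the sample)."""
    flat_s = [p for row in sample for p in row]
    flat_i = [p for row in image for p in row]
    if len(flat_s) != len(flat_i):
        raise ValueError("images must have the same size")
    differing = sum(a ^ b for a, b in zip(flat_s, flat_i))
    return 1.0 - differing / len(flat_s)


def best_match(templates, image, threshold):
    """Return the label of the closest template, or None if none passes."""
    best_label, best_score = None, threshold
    for label, sample in templates.items():
        score = similarity(sample, image)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


templates = {"1": [[0, 1], [0, 1]], "7": [[1, 1], [0, 1]]}
print(best_match(templates, [[1, 1], [0, 1]], threshold=0.8))
```

In practice the templates would be the 20x30 binarized character images mentioned above, and the score of the winning template is what gets summed over the six characters later on.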

Layout: letter-digit-digit-digit-letter-letter

Yes, we are looking for Russian Federation license plates in this format. Keep in mind that the digit 0 and the letter "O" are practically indistinguishable from each other, as are the digit 8 and the letter "B". We arrange all characters from left to right and take 6 of them.
Criterion one: letter-digit-digit-digit-letter-letter (do not forget about 0/O and 8/B)
Criterion two: the deviation of the bottom edge of the 6 characters from a straight line

The total score for a hypothesis is the sum of the Hamming distances of all 6 characters. The higher, the better.

So, if the total score is above the threshold, we consider that we have found the 6 characters of the number (without the region). If it is below the threshold, we move on to the algorithm that is robust to dirty numbers.
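Criterion one can be sketched as a small layout check that also resolves the 0/O and 8/B ambiguity by position. The function and table names are hypothetical, and Latin letters stand in for the look-alike characters used on real plates.

```python
# Look-alike pairs mentioned in the text: 0/O and 8/B.
LETTER_FOR_DIGIT = {"0": "O", "8": "B"}
DIGIT_FOR_LETTER = {"O": "0", "B": "8"}

PATTERN = "LDDDLL"   # letter-digit-digit-digit-letter-letter


def fit_to_layout(chars):
    """Return the six characters corrected to the layout, or None if they cannot fit."""
    if len(chars) != len(PATTERN):
        return None
    fixed = []
    for ch, kind in zip(chars, PATTERN):
        if kind == "L":
            ch = LETTER_FOR_DIGIT.get(ch, ch)   # a 0 in a letter slot is read as O
            if not ch.isalpha():
                return None
        else:
            ch = DIGIT_FOR_LETTER.get(ch, ch)   # an O in a digit slot is read as 0
            if not ch.isdigit():
                return None
        fixed.append(ch)
    return "".join(fixed)


print(fit_to_layout("0123BC"))
print(fit_to_layout("AB1234"))
```

A hypothesis that fails this check can be rejected before any score is summed, which keeps the fast path cheap.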

It is also worth treating the letters "H" and "M" separately. To do this, make a separate classifier, for example one based on a histogram of gradients.

Region

The next two or three characters above the line drawn along the bottom of the 6 characters already found are the region. If a third digit exists and its similarity is above the threshold, the region consists of three digits; otherwise, of two.

However, recognition of the region often does not go as smoothly as one would like. The digits in the region are smaller and may not separate cleanly. It is therefore better to recognize the region with the method that is more robust to dirt, noise and overlap, described below.

Some parts of the algorithm are not described in much detail. This is partly because only a mock-up of this algorithm exists so far, and it still has to be tested and debugged on those thousands of images. If the number is good and clean, you need either to recognize it within milliseconds or to answer "failed" and move on to a more serious algorithm.

An algorithm robust to dirty numbers

Clearly, the algorithm described above does not work at all if the characters on the plate stick together because of poor image quality (dirt, low resolution, an unlucky shadow or shooting angle).

Here are examples of numbers on which the first algorithm could do nothing:

Here you have to rely on the borders of the license plate and then search for characters inside a strictly defined area, with exactly known orientation and scale. And most importantly, no binarization!

We are looking for the lower border number

The easiest and most reliable stage in this algorithm. We turn over several hypotheses at the angle of rotation and we build for each hypothesis by turning the histogram of brightness of pixels along the horizontal lines for the bottom half of the image:

We choose the maximum gradient and so we determine the angle of inclination and at what level to cut down the number below. Will not forget to improve the contrast and get this image:

In general, it is worth using not only the brightness histogram but also histograms of variance and of gradients, to make the cropping of the plate more reliable.
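The lower-border search can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the function names, the set of tilt hypotheses, and the use of a column shear instead of a true rotation (a reasonable approximation for small angles) are all assumptions.

```python
import numpy as np

def row_profile(gray, slope):
    """Mean brightness of each row, read along lines tilted by `slope` rows per column."""
    h, w = gray.shape
    # Vertical offset of the tilted line at each column, relative to the image centre.
    shifts = np.round((np.arange(w) - w / 2) * slope).astype(int)
    # Row to sample for every (row, column) pair; clamp at the image edges.
    rows = np.clip(np.arange(h)[:, None] + shifts[None, :], 0, h - 1)
    return gray[rows, np.arange(w)[None, :]].mean(axis=1)

def find_lower_border(gray, slopes=(-0.1, -0.05, 0.0, 0.05, 0.1)):
    """Return (slope, row): the best tilt hypothesis and the first row below the plate."""
    gray = np.asarray(gray, dtype=float)
    h = gray.shape[0]
    best_strength, best_slope, best_row = -1.0, None, None
    for slope in slopes:                   # hypothetical tilt hypotheses
        profile = row_profile(gray, slope)
        grad = np.abs(np.diff(profile))    # grad[i]: change between rows i and i + 1
        lower = grad[h // 2:]              # only the bottom half of the image
        i = int(np.argmax(lower))
        if lower[i] > best_strength:
            best_strength, best_slope, best_row = lower[i], slope, h // 2 + i + 1
    return best_slope, best_row
```

The sharpest jump in the row profile appears only when the tilt hypothesis matches the actual edge, which is why the maximum over all hypotheses yields the angle and the crop level at the same time.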

Finding the upper border of the plate

Here things are less obvious: when a rear plate is shot with a hand-held camera, the upper border can be strongly curved and may partly cover the characters or lie in shadow, as in this case:


There is no sharp brightness transition at the top of the plate, and the maximum gradient would cut the plate through the middle.

We got around this in a not entirely trivial way: we trained a Haar cascade detector for every digit and every letter, found all the characters in the image, and from them determined the top line at which to crop:

It might seem that we could stop here, since the digits and letters have already been found. In practice, of course, a Haar detector can make mistakes, and we have 7-8 characters here. The digit 4 is a good example: if the upper border of the plate merges with the 4, it is easy to see a 7 instead, which, by the way, is what happened in this example. On the other hand, despite the detection error, the upper edge of the detected rectangles really does coincide with the upper border of the plate.
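The article does not say exactly how the top line is derived from the detected rectangles. One simple possibility, shown here purely as an assumption, is to take the median of their top edges, so that a few false detections do not move the line. The boxes are taken as `(x, y, w, h)` tuples, the form returned for example by OpenCV's `CascadeClassifier.detectMultiScale`. The function name is hypothetical.

```python
import numpy as np

def top_line_from_boxes(boxes):
    """Estimate the upper crop line from character boxes given as (x, y, w, h).

    Assumption: the median of the top edges is used, so that a minority of
    misplaced detections does not shift the result.
    """
    tops = np.array([y for _, y, _, _ in boxes], dtype=float)
    if tops.size == 0:
        return None
    return float(np.median(tops))
```

Note that only the position of the rectangles is used, not the class the detector assigned to them, which is why a 4 mistaken for a 7 does no harm at this stage.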

Finding the side borders

Nothing tricky here either: it works exactly like the lower border. The only difference is that the gradient of the first or last character on the plate may be stronger than the gradient of the plate's vertical border, so we select not the maximum but the first gradient that exceeds a threshold. As with the lower border, several tilt hypotheses have to be tried, because with perspective the vertical and horizontal borders are not guaranteed to be perpendicular.
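The difference between "maximum gradient" and "first gradient above a threshold" can be illustrated with a short sketch. The function name and the threshold value are assumptions, and the tilt hypotheses are left out for brevity.

```python
import numpy as np

def find_side_borders(gray, threshold):
    """Return (left, right) so that gray[:, left:right] is the plate, or None.

    Scanning inwards from each side, the first brightness jump between
    neighbouring columns that exceeds `threshold` is taken as the border,
    even if a character edge further inside is stronger.
    """
    profile = np.asarray(gray, dtype=float).mean(axis=0)
    grad = np.abs(np.diff(profile))        # grad[i]: change between columns i and i + 1
    strong = np.flatnonzero(grad > threshold)
    if strong.size == 0:
        return None
    left = int(strong[0]) + 1              # first column inside the plate
    right = int(strong[-1]) + 1            # one past the last column inside the plate
    return left, right
```

On a frame where a dark character sits right next to the border, `np.argmax(grad)` would land on the character edge, while the thresholded scan still stops at the plate border.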

So, here is a well-cropped plate:


Yes! It is especially satisfying to show a frame with a truly awful plate that was nevertheless recognized.

One sad thing: by this stage, between 5% and 15% of plates may be cropped incorrectly. For example, like this:

(By the way, someone sent us this yellow taxi plate; as far as I understand, its format is not the regular one.)

All of this was done purely to reduce computation, since enumerating every possible position, scale, and tilt of the characters while searching for them is computationally very expensive.

Splitting the line into characters

Unfortunately, because of perspective and because not all characters have a standard width, the characters in the already cropped plate still have to be isolated somehow. A brightness histogram comes to the rescue again, this time along the X axis:

The only caveat is that from here on two hypotheses have to be explored: either the characters start immediately, or one histogram maximum should be skipped. The reason is that on some plates the screw hole or the screw head may show up as a separate character, while on others it is not visible at all.
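A rough sketch of this splitting step, under stated assumptions: characters are darker than the plate background, a column counts as "character" when its mean darkness is above a relative level, and both function names and the `rel_threshold` parameter are hypothetical.

```python
import numpy as np

def split_into_runs(gray, rel_threshold=0.5):
    """Return [(start, end), ...] column ranges (end exclusive) of dark runs."""
    darkness = 255.0 - np.asarray(gray, dtype=float).mean(axis=0)
    level = darkness.min() + rel_threshold * (darkness.max() - darkness.min())
    mask = darkness > level
    # Pad with False so that runs touching the image edge are closed properly.
    padded = np.concatenate(([False], mask, [False]))
    change = np.flatnonzero(padded[1:] != padded[:-1])
    return [(int(s), int(e)) for s, e in zip(change[0::2], change[1::2])]

def start_hypotheses(runs):
    """Two hypotheses: characters start at once, or the first run is a screw."""
    return [list(runs), list(runs[1:])]
```

Both candidate lists are then passed on to character recognition, and the one with the better overall match is kept.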

Character recognition

The image is still not binarized, so we use all the information it contains.

These are printed characters, so weighted covariance is a suitable way to compare an image with a sample:

The samples for comparison and the weights used in the covariance:
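The exact formula is not reproduced in this text, so the following is only a sketch of one common definition of weighted covariance. The normalization to the range [-1, 1] is an added assumption, made so that scores for different samples are comparable; the function names are hypothetical.

```python
import numpy as np

def weighted_covariance(patch, sample, weights):
    """Weighted covariance of two equally sized grayscale images."""
    w = weights / weights.sum()            # normalise the weights to sum to 1
    mean_p = (w * patch).sum()
    mean_s = (w * sample).sum()
    return float((w * (patch - mean_p) * (sample - mean_s)).sum())

def weighted_similarity(patch, sample, weights):
    """Weighted covariance scaled to [-1, 1] (assumption, not from the article)."""
    cov = weighted_covariance(patch, sample, weights)
    var_p = weighted_covariance(patch, patch, weights)
    var_s = weighted_covariance(sample, sample, weights)
    denom = np.sqrt(var_p * var_s)
    # A flat patch or sample carries no shape information: score it as 0.
    return 0.0 if denom < 1e-12 else float(cov / denom)
```

Because the mean is subtracted and the result is normalized, a patch that differs from the sample only in brightness or contrast still scores close to 1, which is what makes the measure usable without binarization.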

Of course, the area isolated with the histogram cannot simply be compared with the samples directly. Several hypotheses on offset and scale are needed.
Number of hypotheses for the position along the X axis = 4
Number of hypotheses for the position along the Y axis = 4
Number of hypotheses for the scale = 3

Thus, for each area, comparing against a single character requires computing 4 x 4 x 3 = 48 covariances.

First we find the 3 large digits. That is 3 x 10 x 4 x 4 x 3 = 1440 comparisons.

Then come one letter on the left and two more on the right. There are 12 letters to compare against, so the number of comparisons is 3 x 12 x 4 x 4 x 3 = 1728.
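The search over offset and scale hypotheses for one character position might look like the sketch below. Everything specific in it is an assumption: the offset and scale values, the nearest-neighbour resize, and the use of plain normalized correlation as a stand-in for the weighted covariance.

```python
import itertools
import numpy as np

def resize_nearest(img, shape):
    """Nearest-neighbour resize using pixel centres (hypothetical helper)."""
    h, w = shape
    rows = ((np.arange(h) + 0.5) * img.shape[0] / h).astype(int)
    cols = ((np.arange(w) + 0.5) * img.shape[1] / w).astype(int)
    return img[np.ix_(rows, cols)]

def correlation(a, b):
    """Normalized correlation, standing in for the weighted covariance."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return 0.0 if denom < 1e-12 else float((a * b).sum() / denom)

def best_match(plate, x0, y0, samples, offsets=(-1, 0, 1, 2), scales=(0.9, 1.0, 1.1)):
    """Try 4 x 4 x 3 hypotheses per sample around (x0, y0); return the best one."""
    best = (None, None, -2.0)
    evaluated = 0
    for label, sample in samples.items():
        for dx, dy, s in itertools.product(offsets, offsets, scales):
            h = int(round(sample.shape[0] * s))
            w = int(round(sample.shape[1] * s))
            x, y = x0 + dx, y0 + dy
            if x < 0 or y < 0 or y + h > plate.shape[0] or x + w > plate.shape[1]:
                continue                   # hypothesis falls outside the image
            score = correlation(plate[y:y + h, x:x + w], resize_nearest(sample, (h, w)))
            evaluated += 1
            if score > best[2]:
                best = (label, (dx, dy, s), score)
    return best[0], best[1], best[2], evaluated
```

With 10 digit samples this evaluates 10 x 48 = 480 hypotheses per position, or 1440 for the three digits, matching the count above.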

Once we have the 6 characters, everything to the right of them is the region.

The region may contain 2 or 3 digits, and this has to be taken into account. Splitting the region with a histogram is pointless at this stage, because the image quality may be too low. So we simply find the digits one after another, from left to right. We start from the top-left corner, with several hypotheses along the X axis, the Y axis, and the scale, and find the best match. We then shift to the right by a set amount and search again. The third character is searched for both to the left of the first and to the right of the second; if its similarity score exceeds the threshold, we are in luck and the region number consists of three digits.
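The decision between a two-digit and a three-digit region can be summarized in a few lines. This is a sketch only: `match_near` is a hypothetical callable that runs the offset and scale search around a given x position and returns `(digit, similarity, x)`, and `step` and `threshold` are assumed parameters.

```python
from typing import Callable, Tuple

Match = Tuple[str, float, int]  # (digit, similarity score, x position found)

def read_region(match_near: Callable[[int], Match],
                x_start: int, step: int, threshold: float) -> str:
    """Read a 2- or 3-digit region code by matching digits one after another."""
    first = match_near(x_start)
    second = match_near(first[2] + step)
    # The third digit may lie on either side of the pair already found.
    left = match_near(first[2] - step)
    right = match_near(second[2] + step)
    third = max(left, right, key=lambda m: m[1])
    if third[1] <= threshold:
        return first[0] + second[0]        # two-digit region
    if third is left:
        return left[0] + first[0] + second[0]
    return first[0] + second[0] + right[0]
```

For example, with a lookup table standing in for the real matcher, a strong third match to the right of the pair yields a three-digit code, and raising the threshold above its score falls back to two digits.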

Conclusions
Practical use of the algorithm (the second one described in the article) once again confirmed a well-known truth about recognition tasks: a truly representative dataset is needed when developing algorithms. We targeted dirty and worn plates, because the test set was shot in winter. And indeed, quite bad plates were often recognized successfully, but there were almost no clean plates in the training sample.

The other side of the coin then showed itself: few things annoy a user as much as an automatic system failing at a completely primitive task. "Come on, what is there not to read here?!" That an automatic system fails on a dirty or worn plate, by contrast, is expected.

Frankly, this is our first experience of building a recognition system for the mass consumer, and we still have to learn to think about such "trifles" as users. A specialist who developed an iOS app similar to "Recognitor" has now joined us. In the UI, the user can see what is being sent to the server, choose which of the detected plates is the right one, and select the required area in an already "frozen" frame. This is much more convenient to use. Automatic recognition stops being a dumb feature without which nothing can be done and becomes simply an assistant.

Designing a system in which automatic image recognition is harmonious and convenient for the user turned out to be no easier a task than creating the recognition algorithms themselves.

And, of course, I hope that the article will be useful.