Artificial intelligence gets a bad rap as a job killer that replaces human workers. In some areas this is true, but in others, particularly in how data are cleaned and processed, AI is creating new jobs.
Data labeling and annotation is a burgeoning industry born from AI. People label, mark, color, or highlight data to show differences and similarities, whether the data come from unstructured sources like cameras and social media or from structured sources like databases. To train a machine to learn what a stop sign is, a person must go through camera footage of a street and mark up every stop sign in each frame. The machine is then fed thousands of these labeled images. Over time, the system can more accurately identify what a stop sign is by processing the labeled data. This type of machine learning, where a system learns from examples that people have labeled, is termed supervised learning, and it is how many deep learning systems are trained.
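To make the idea of learning from labeled data concrete, here is a minimal sketch in Python. It is not how a production vision system works: the two features (fraction of red pixels, number of corners), the sample values, and the nearest-centroid method are all assumptions chosen for illustration. The point is only that the human-supplied labels are what the system learns from.

```python
from collections import defaultdict
from math import dist

# Hypothetical hand-labeled examples: (features, label).
# Features are invented for illustration:
# (fraction of red pixels, number of corners on the sign).
labeled = [
    ((0.80, 8), "stop_sign"),
    ((0.75, 8), "stop_sign"),
    ((0.85, 8), "stop_sign"),
    ((0.10, 4), "other"),
    ((0.70, 3), "other"),  # e.g. a red triangular sign
    ((0.05, 0), "other"),
]

def train(examples):
    """Average the feature vectors of each label into one centroid per label."""
    sums = defaultdict(lambda: [0.0, 0.0])
    counts = defaultdict(int)
    for (f0, f1), label in examples:
        sums[label][0] += f0
        sums[label][1] += f1
        counts[label] += 1
    return {
        label: (s[0] / counts[label], s[1] / counts[label])
        for label, s in sums.items()
    }

def predict(centroids, features):
    """Return the label whose centroid is closest to the new example."""
    return min(centroids, key=lambda label: dist(centroids[label], features))

centroids = train(labeled)
print(predict(centroids, (0.78, 8)))  # an unseen red octagon
print(predict(centroids, (0.08, 4)))  # an unseen non-red rectangle
```

Real systems learn from raw pixels rather than two hand-picked numbers, but they depend on the same ingredient: examples that a person has already marked up.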
As this process is essential for algorithms to accurately perform core parts of their function, the data labeling industry is set to take off over the next five years. In 2018, the market for AI and machine learning data preparation, a process that relies heavily on people to manually label data, stood at $500 million. According to Cognilytica, that is expected to more than double, reaching $1.2 billion by 2023. Third-party providers can expect to see a significant share of that growth, going from $150 million of the market to $1 billion over that same time frame. Data labeling is particularly essential for AI that deals with object and image recognition, autonomous vehicles, and text and image annotation.
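For readers who want to see what those figures imply, the short calculation below works out the third-party share of the market and the implied annual growth rates. It assumes the forecast spans exactly five years (2018 to 2023) and that growth compounds evenly each year. Neither assumption comes from Cognilytica, and the function name is illustrative.

```python
def annual_growth_rate(start, end, years):
    """Compound annual growth rate that takes `start` to `end` in `years` years."""
    return (end / start) ** (1 / years) - 1

# Figures quoted above, in millions of US dollars.
market_2018, market_2023 = 500, 1200
third_party_2018, third_party_2023 = 150, 1000
years = 5  # assumption: 2018 through 2023

share_2018 = third_party_2018 / market_2018
share_2023 = third_party_2023 / market_2023

print(f"Third-party share of market, 2018: {share_2018:.0%}")
print(f"Third-party share of market, 2023: {share_2023:.0%}")
print(f"Implied annual growth, whole market: "
      f"{annual_growth_rate(market_2018, market_2023, years):.1%}")
print(f"Implied annual growth, third-party: "
      f"{annual_growth_rate(third_party_2018, third_party_2023, years):.1%}")
```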