Sept. 27, 2022
Challenges for Organizations Building Computer Vision AI Models
Computer vision is one of the most exciting subsets of artificial intelligence and machine learning. We’re already seeing a wealth of real-world use cases in industries from manufacturing to agriculture to retail. The possibilities that computer vision offers span essentially every single industry and field and are growing rapidly.
A machine enabled with computer vision can process images and video extremely quickly. It has the potential to efficiently and accurately detect defects, count items, notice changes, and more. The end result: higher quality and the potential for safer operations and lower costs, regardless of the field.
What Is Computer Vision?
Computer vision is not just image recognition, although that is a crucial step in the process. What makes computer vision truly intelligent is the ability to provide recommendations or make decisions based on unseen images. Software containing trained AI models can process images and take action very quickly.
What can computer vision models do?
- Image segmentation
- Image classification
- Facial recognition
- Feature matching
- Pattern detection
- Object detection
- Edge detection
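To make one of these tasks concrete, here is a minimal sketch of edge detection using the standard Sobel operator, written in plain Python. The function name `sobel_edges` and the tiny synthetic image are assumptions made for illustration; production systems would typically rely on an optimized image library or a trained model instead.

```python
def sobel_edges(image):
    """Return the Sobel gradient magnitude for a 2D grayscale grid.

    `image` is a list of equal-length rows of pixel intensities.
    Border pixels are left at 0.0 because the 3x3 kernel does not fit there.
    """
    kernel_x = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]   # responds to vertical edges
    kernel_y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]   # responds to horizontal edges
    height, width = len(image), len(image[0])
    magnitude = [[0.0] * width for _ in range(height)]

    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = gy = 0
            for j in range(3):
                for i in range(3):
                    pixel = image[y + j - 1][x + i - 1]
                    gx += kernel_x[j][i] * pixel
                    gy += kernel_y[j][i] * pixel
            magnitude[y][x] = (gx * gx + gy * gy) ** 0.5
    return magnitude


# Synthetic 5x6 image: dark on the left half, bright on the right half.
image = [[0, 0, 0, 9, 9, 9] for _ in range(5)]
edges = sobel_edges(image)
print(edges[2])  # strongest response where dark meets bright
```

The response is zero in the flat regions and peaks at the columns where the intensity changes, which is the boundary a downstream step would treat as an edge.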
Computer Vision and Narrow AI
Note that most computer vision applications are used for very specific tasks. Called “narrow AI,” this is where researchers and developers have made the most progress in the field. Computers can get very good at detecting, recognizing, and classifying the images they’ve been trained to understand. However, this “knowledge” is difficult to transfer to other inputs and tasks.
While researchers continue moving toward artificial general intelligence, narrow AI in the form of computer vision is becoming more accessible each year.
Building a Working Computer Vision AI Model
Once you have your specific business problem that the computer vision model will solve, there are a few broad steps to follow.
- Decide upon the right training data to use, and then collect it and label it.
- Train your model to correctly interpret the training images, then evaluate its results.
- Deploy and test your model using similar images it has not previously been exposed to.
- Iterate until the solution is consistently accurate.
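The steps above can be sketched end to end with a toy example. This is only an illustration under stated assumptions: the "images" are hypothetical four-pixel lists, the labels are invented, and the nearest-centroid classifier stands in for whatever model an organization would actually train. The helper names (`train_centroids`, `predict`, `accuracy`) and the 0.9 target are not from any particular library or standard.

```python
def train_centroids(samples):
    """Average the pixel vectors of each label. `samples` is a list of (pixels, label)."""
    sums, counts = {}, {}
    for pixels, label in samples:
        totals = sums.setdefault(label, [0.0] * len(pixels))
        for i, value in enumerate(pixels):
            totals[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in totals] for label, totals in sums.items()}


def predict(centroids, pixels):
    """Return the label whose centroid is closest (squared Euclidean distance)."""
    def distance(label):
        return sum((c - p) ** 2 for c, p in zip(centroids[label], pixels))
    return min(centroids, key=distance)


def accuracy(centroids, samples):
    """Fraction of samples whose predicted label matches the given label."""
    correct = sum(predict(centroids, pixels) == label for pixels, label in samples)
    return correct / len(samples)


# Step 1: collected and labeled training data (synthetic).
training_set = [
    ([10, 20, 15, 5], "dark"),
    ([0, 30, 10, 20], "dark"),
    ([200, 220, 210, 230], "bright"),
    ([240, 250, 235, 245], "bright"),
]
# Step 3: similar images the model has not been exposed to.
held_out_set = [
    ([25, 10, 5, 30], "dark"),
    ([215, 225, 205, 240], "bright"),
    ([120, 110, 100, 90], "bright"),  # a harder, mid-range example
]

model = train_centroids(training_set)          # Step 2: train
score = accuracy(model, held_out_set)          # Step 3: test on unseen images
TARGET = 0.9                                   # hypothetical acceptance threshold
needs_iteration = score < TARGET               # Step 4: decide whether to iterate
print(f"held-out accuracy: {score:.2f}, iterate: {needs_iteration}")
```

In this sketch the mid-range image is misclassified, so the held-out score falls below the target and the loop would continue, for example by collecting and labeling more mid-range examples before retraining.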
Three Key Challenges in Building and Implementing Computer Vision AI
Computer vision AI has the potential to execute repetitive and monotonous tasks more quickly than humans can. Its assistance with defect detection can lead to better products and lower costs. It can even become more accurate than the human eye. However, successfully implementing computer vision remains challenging for organizations. Below, we’ll discuss some common challenges and some ways we propose to mitigate them.
First Challenge: Access to Data Scientists
Data scientists are essential in determining how data can be used to solve business problems, gathering and analyzing data, designing data modeling processes, and creating algorithms. Computer vision models depend on vast amounts of data to learn accurately, making this role indispensable.
However, many, if not most, organizations do not have dedicated data scientist resources. Even if they do, these skills are often in demand, and resources may be limited or stretched thin due to the labor-intensive nature of traditional AI model-building.
Organizations that don’t have in-house data scientists can face an uphill battle finding the right talent with the right skill sets. Good data scientists are hard to find for two reasons. For one, artificial intelligence is becoming increasingly accessible, which increases demand for these professionals. For another, the wide variety of technical skills required makes this a challenging field to get into, leading to a limited talent pool.
Some of the technical skills required are:
- Working with a plethora of different frameworks, programming languages, and complex tools.
- Combining different algorithms and choosing the best algorithm for their purposes.
- Creating predictive models.
- Data wrangling and data visualization.
- Working with different sizes and shapes of datasets.
Additionally, good data scientists will have enough business acumen to understand business problems and determine the right data to use when training the computer vision model.
Training Engineers in Data Science
In the face of these computer vision challenges, some organizations look inward for data science talent. It is possible to train your own professionals to build computer vision models, but it does take time and resources. Software engineers with an inclination toward artificial intelligence or machine learning make good candidates for learning data science in-house.
However, the lack of user-friendly tools and processes makes this a difficult path to tread. In addition to the processes themselves, the dependencies of the tools, frameworks, and libraries must be managed effectively. For this reason alone, many end up looking for data scientists who can already navigate the available tools. If you could find more user-friendly solutions that required little to no code, it would be possible to build computer vision models in a simpler way. In this case, it would be more feasible to train your own data scientists in the other skills needed without the platform as a barrier.
Second Challenge: Collaboration Between Data Scientists and Domain Experts
Domain experts are professionals with highly specialized knowledge pertaining to a specific business or technical area, process, method, or practice. This expertise makes them invaluable for organizations looking to solve challenging problems and develop more efficient or effective solutions.
Data science experts know the technology needed to train the computer vision model. They understand working with large data sets and how to visualize the data. In many cases, domain experts work closely with the use case for that particular model, as well as the nuances of that particular business problem. While domain experts are experts on their own subject, they don’t necessarily have extensive knowledge about artificial intelligence, let alone computer vision.
The Benefits of Collaboration Between Data Science and Domain Experts
Ideally, data scientists and domain experts will collaborate to build an accurate model. This collaboration makes the AI model training process faster and the resulting model more accurate, since domain experts are familiar with the data for their particular use case. Domain experts also play an important part in testing. They can analyze results to see if the model is really working as needed or if further adjustments or a whole new direction are required. Additionally, increased collaboration with domain experts can ease pressure on often limited artificial intelligence and data science resources.
The knowledge transfer goes both ways. Just as domain experts inform the training and testing process, data scientists help them understand the solution they are creating. A good understanding of the technology allows domain experts to best apply the solution to their business needs.
The Breach Between Data Science and Domain Expertise
However, despite the benefits of collaboration, it doesn’t always happen as much as it should. For one thing, data science and machine learning professionals use different kinds of tools than domain experts. They may also have different levels of expertise. Both fields are complex, and the skills for each take years of experience to learn.
It’s also incredibly challenging to be both a data science expert and a domain expert. Both fields require a huge commitment in order to stay abreast of the latest developments, a task too great for most bandwidth-constrained professionals.
Third Challenge: Time-Consuming, Labor-Intensive, and Data-Intensive Processes
There is simply a huge amount of time and effort required to build, train, and test a successful computer vision AI model.
Gathering, Organizing, and Labeling Data
No matter how simple the intended results, computer vision models need a large amount of data in order to be trained. Choosing the right kind of data isn’t enough. Typically every piece of data used in training must be labeled manually so that the model can build a foundation for understanding what it is “seeing.”
For computer vision, this usually means labeling key points, pixels, or images, or creating the bounding box that encloses the digital space you want the computer to look at. Some sources say 80% of an AI project will be dedicated to collecting, organizing, and labeling data.
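As a rough illustration of what a bounding-box label can look like in practice, the sketch below stores one label per box and compares two annotators' boxes using intersection over union (IoU), a standard overlap measure. The `BoxLabel` class, its field names, and the "scratch" example are hypothetical; real annotation tools each define their own formats.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoxLabel:
    """One manually drawn bounding box, in pixel coordinates (hypothetical format)."""
    label: str
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    def area(self) -> float:
        width = max(0.0, self.x_max - self.x_min)
        height = max(0.0, self.y_max - self.y_min)
        return width * height


def iou(a: BoxLabel, b: BoxLabel) -> float:
    """Intersection over union: 1.0 for identical boxes, 0.0 for no overlap."""
    overlap_w = max(0.0, min(a.x_max, b.x_max) - max(a.x_min, b.x_min))
    overlap_h = max(0.0, min(a.y_max, b.y_max) - max(a.y_min, b.y_min))
    intersection = overlap_w * overlap_h
    union = a.area() + b.area() - intersection
    return intersection / union if union > 0 else 0.0


# Two annotators label the same defect on the same image.
first_annotator = BoxLabel("scratch", 10, 10, 50, 50)
second_annotator = BoxLabel("scratch", 30, 30, 70, 70)

agreement = iou(first_annotator, second_annotator)
print(f"annotator agreement (IoU): {agreement:.3f}")
```

A low agreement score like this one is a signal that the labeling guidelines may need tightening before the boxes are used for training, which is part of why labeling consumes so much of a project's time.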
Testing the Computer Vision Model
Accuracy and bias testing also take time. It’s becoming clear that more testing is needed to really know if an AI model is giving the intended results for the right reasons or not. Underspecification is always a risk when building computer vision AI models. This happens when there are many ways for a model to perform a certain way on a test set, meaning the observed effects could have a variety of possible causes.
This is one common explanation for why many models seem to test well but don’t work in the real world. According to Google, underspecification happens a lot more than we’d like. Mitigating it requires additional stress tests on a large number of examples, which takes even more time.
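A toy example can make underspecification easier to picture. In the sketch below, two hypothetical decision rules score identically on a small test set, yet only one of them holds up on a stress set captured under dimmer lighting. The features (`brightness`, `edge_count`), thresholds, and data are all invented for illustration and do not reproduce any published experiment.

```python
def rule_brightness(sample):
    """Hypothetical model A: treats dark images as defects."""
    return "defect" if sample["brightness"] < 100 else "ok"


def rule_edges(sample):
    """Hypothetical model B: treats images with many edges as defects."""
    return "defect" if sample["edge_count"] > 5 else "ok"


def accuracy(rule, samples):
    """Fraction of samples for which the rule's output matches the label."""
    return sum(rule(s) == s["label"] for s in samples) / len(samples)


# In the test set, darkness and edge count both happen to track the label.
test_set = [
    {"brightness": 60, "edge_count": 9, "label": "defect"},
    {"brightness": 80, "edge_count": 7, "label": "defect"},
    {"brightness": 150, "edge_count": 2, "label": "ok"},
    {"brightness": 170, "edge_count": 1, "label": "ok"},
]
# Stress set: same parts photographed under dimmer lighting.
stress_set = [
    {"brightness": 40, "edge_count": 8, "label": "defect"},
    {"brightness": 70, "edge_count": 2, "label": "ok"},
    {"brightness": 90, "edge_count": 1, "label": "ok"},
    {"brightness": 50, "edge_count": 10, "label": "defect"},
]

for name, rule in [("brightness rule", rule_brightness), ("edge rule", rule_edges)]:
    print(name, "test:", accuracy(rule, test_set), "stress:", accuracy(rule, stress_set))
```

Both rules look equally good on the test set, so the test set alone cannot tell them apart. Only the stress set reveals that the brightness rule was right for the wrong reason.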
Challenges When Iterating
Whenever a model shows less-than-accurate results, iteration is necessary. The labor-intensive nature of data wrangling and model testing is exacerbated in traditional computer vision AI models by the need to repeat a lot of the process steps to retrain the model each time the algorithm needs to be adjusted or new data sources emerge.
Organizations can alleviate the time, effort, and data burden by selecting the right training data during annotation to reduce unintentional bias and the need for retraining. Companies can also outsource annotation if the time and labor needed are too much to be handled internally. However, this solution comes with its own challenges, including cost and expected quality.
What Do These Challenges Imply for Building Computer Vision Models?
All of these challenges highlight why it is so necessary for humans to remain involved as much as possible in artificial intelligence-based solutions. At the end of the day, AI and machine learning are best used to enhance human expertise, not replace it.
The new Intel® Geti™ platform offers an easy-to-use computer vision AI solution allowing users to label, train, and optimize AI machine learning models.
The Intel® Geti™ platform reduces the time needed to build models by eliminating the complexities of model development and harnessing greater collaboration between teams.