The Race between Voice and Images for Search: Who Will Win?

Both visual search and voice search have entered the mainstream. But which is the frontrunner in the race to become the dominant way to search?

A quick Google search for the terms “voice search” and “visual search” reveals how popular both technologies currently are. Every few days there’s a new article on how voice and images are revolutionizing the way we search. Firms like Gartner predict a profitable future for both voice and image search: by 2021, they state, brands that redesign their websites to support visual and voice search will increase digital commerce revenue by 30%. As is well documented on the FASHWELL blog, all the biggest companies now offer some type of visual search. All of them have also developed voice search technology that has passed the 95% accuracy threshold.

Voice Search vs. Visual Search: Who’s Doing What

Back in 2016, Mary Meeker predicted the surge of Google voice searches in her annual Internet Trends report. Two years later, Alpine AI estimated that one billion voice searches happen per month. This widespread use is no surprise: voice-activated technologies are incredibly convenient and easy to use. Humans speak faster than they type. Talking to voice-enabled devices to schedule appointments or place search queries saves time and feels natural.

Users can now multitask or do things hands-free. Amazon has entered the voice market with its Echo home assistants, Google has Google Home, and Apple sells HomePod devices. And of course, all of them offer voice-enabled technology on their mobile devices. Alongside the bigger tech companies, there are many smaller voice engine companies on the market that specialize in particular industries, like Voysis, Mindori or Sayspring.

The image recognition market is equally profitable and is estimated to grow to $38.92 billion by 2021. The big players in tech (Amazon, Google, Microsoft) are already active here, each offering its own version of visual search. Here too, countless companies are developing image recognition software for all types of markets. It’s already widely used in healthcare and the self-driving car industry. Image recognition and visual search have an especially strong use case in retail and eCommerce: everyone from Zalando, Asos and Wayfair to Pinterest and Snapchat has integrated visual search tools.

Where Visual Search Has The Edge Over Voice

A recent survey found that 5% of consumers use voice search on their mobile devices to make purchases online. However, it’s doubtful whether that number will grow much. Since mobile commerce will soon overtake regular eCommerce, having a quick and easy tool to help shoppers find specific products is key in this age of instant gratification. Although the pros and cons of voice-activated search and visual search are evenly balanced in general, visual search reigns supreme when it comes to online commerce.


Take Amazon’s Alexa and Echo devices as an example. By 2020, there will be around 21 million smart home speakers in the U.S., helping Americans complete routine tasks like booking appointments or writing shopping lists. However, when it comes to questions of style, such as helping users shop for new clothes or put together looks, Amazon relies on Echo Look. Look uses a camera to analyze a user’s clothes and makes fashion recommendations based on those images. Using voice search to solve a style problem would be fruitless because, just like text search, voice search requires the user to find the right words to describe a product. And as we have seen, text search really fails for fashion. Visual search, on the other hand, uses machine-learning-based image recognition technology to analyze an image and understand its contents, so the user never has to describe the product in words. Voice search can’t compare to visual search in this domain.

Both Are Cool, but Visual Search Is Better for eCommerce

It boils down to this: voice search is a helpful piece of technology that simplifies a lot of daily tasks and routines. However, when it comes to shopping for fashion or furniture online, images are still king.