Some weeks ago in IT Matters, I wrote about foundation models, a wholly new approach to Artificial Intelligence (AI). Foundation models have become popular because they break with the traditional method of training AI programs on smaller data sets, and they were expected to be game changers. Built on large neural networks, foundation models are also called ‘generative AI models’. These are new buzzwords, and I have come across many startups that want their investors (venture capital firms, angel investors and the like) to believe that they are basing their work on such generative AI models.
The most salient feature of generative AI models is that they scour almost every shred of information available on the web, a data store that is doubling in size every two years, and use this data to train AI programs to generate new output. Going by the two-year doubling rule defined by Live-counter.com, which tries to track the size of the internet’s data, that figure is now close to 80 zettabytes. A zettabyte is a trillion gigabytes.
OpenAI, heavily backed by Microsoft, has two such models: GPT-3, which deals mainly with text, and DALL-E, which focuses on images. GPT-3 analysed thousands of digital books and nearly a trillion words posted on blogs, social media and elsewhere on the internet. Its competitor is Google, whose own offering in generative AI is called BERT.
In contrast, the more focused cognitive models have smaller data sets (some of them even filled with dummy data) that are used to train AI programs on specific use cases. For instance, a medico-radiological system would limit itself to X-rays, magnetic resonance imaging (MRI) scans and other such medical images, and would likely not be training itself on poetry or music and other such information that has no relevance to the task at hand.
The sponsors of generative AI have a lofty aim: to form a foundation upon which all sorts of AI applications can be written. There are several problems with this ‘sledgehammer to kill a fly’ approach, which I touched upon in previous columns. Today, however, I would like to dwell on two opposing views, albeit from different industries, of the effect that generative AI models are having. One holds that they are a flash in the pan, while the other is girding itself for a fight.
OpenAI’s text-to-image creator is called DALL-E. Its best-known rival, a newcomer dubbed Stability AI, has just pulled in $101 million in funding for its Stable Diffusion system, according to Techcrunch.com (tcrn.ch/3DL2TyF). In response to news of this funding, Futurism.com reported that Will Manidis, founder and CEO of AI-driven healthcare startup ScienceIO, considers generative AI to be all flash and no substance. While it might be attracting venture capital now, Manidis opines that most such ventures will quickly fade into oblivion.
Manidis’ argument centres on text-to-image generators such as DALL-E. He believes the “creator economy” doesn’t really have much room for growth. Yes, it’s fun and sometimes useful to produce AI-made artworks, but according to him, turning everyone into creators won’t generate major new revenue streams. He expounded his views on the subject in a Twitter thread on 25 October (bit.ly/3frRfPU).
According to his thread, “Billions of hours of human potential every year are wasted on menial tasks. Data entry, form filling, basic knowledge work kind of stuff,” and foundation models may have much better uses here, in the basic automation of menial tasks, than in more refined use cases such as medical technology, which is what his firm focuses on. One could say the argument is a little self-serving, but it is not without merit. Data entry and expert analysis of medical images are, after all, worlds apart.
Meanwhile, it seems as if all is not well for generative AI models at the other end of the spectrum either, with the music industry up in arms against it.
AI generator tools can create brand new music tracks at the click of a button and may be starting to threaten the livelihoods of musicians. This has lobby groups deeply concerned. For instance, the Recording Industry Association of America (RIAA) worries that AI-generated music could threaten the income as well as the rights of human artistes.
The RIAA has a long history of fighting piracy and counterfeiting—first by smashing illegitimate digital copies made on CDs and other media, and then by attempting to secure iron-clad digital copyrights for musicians. According to an RIAA filing with the Office of the US Trade Representative, the US music industry contributes $170 billion to the US economy, supports 2.47 million jobs and accounts for over 236,000 businesses in the US. For every dollar of direct revenue within the US music industry, the RIAA claims an additional 50 cents is created in adjacent industries. Digital sources account for about 90% of music revenue, while physical products such as CDs account for 10%, a clear cross-over made possible by products such as the original Apple iPod and, later, the smartphone.
The RIAA is now gunning for online services that use generative AI to extract and then remix recordings in the style of well-known human artistes. The lobby group maintains that these services violate copyrights and directly pinch the pockets of its members. Given that the RIAA has been successful in protecting its members in the past, those using generative AI for music production do have cause for worry.
Siddharth Pai is co-founder of Siana Capital, and the author of ‘Techproof Me’.