Licensing Fees For AI Training Data Is Impractical—AI Tax Is Workable

News Room

Speaking at the Vanderbilt University Music Law Summit, an intellectual property attorney for OpenAI stated that privately negotiating for each protected work used to train a large language model would be an impossibility—and would foreclose the existence of programs like OpenAI’s ChatGPT. In addition to being a useful strategic stance to take as the company faces numerous lawsuits for the impermissible use of copyrighted material, it is probably true.

Happily, there is a policy arrow in our quiver that will do the trick and help ensure society writ large, if not the individual authors themselves, is compensated for the use of the works.

Determining What is Owed Under Traditional Licensing

An artificial intelligence tax, or more specifically a tax on large language models, could be leveraged to compel model developers and owners to internalize the externalities generated by feeding models copyrighted training data.

Traditional intellectual property law paradigms fall down in the face of the “use” of a piece of copyrighted work not being to copy it, per se, but to learn from it. Putting our rational policy caps on, we can envision a scenario in which an AI model is used to make perfect reproductions of pieces of writing it has been exposed to, supplanting demand for licensed copies. In such a situation, there can be little doubt there would be an issue of copyright violation.

However, when the use of the underlying work is to glean from that piece of writing the underlying truths about writing more generally, the waters are muddied. In the most abstract terms, the piece of work is being used for something the author never intended and can only indirectly affect licensed sales of that work.

By way of analogy, imagine an energy reactor that can listen to recorded works of music and generate electricity from those works. Energy companies each need only really buy one copy of a piece of music, if any as music can be “heard” freely over the radio, and can generate from that work massive amounts of electricity. Energy company profits soar.

Few would argue something is owed to the creators of the underlying works—ranging from the heartfelt thanks of a society gifted with free energy to a percentage of profits. The question revolves around calculating what is due: it seems equally improper to key the amounts owed to the amount the music is sold for as to the amount the energy company can sell electricity for. In the former situation, the profit on a piece of music sold is determined in light of the intended use; in the latter, the use arrived at for the piece of music is so fully outside the scope of what the creator could have considered, compensating them based on that unforeseeable use is an unjustifiable windfall.

The same is true of protected works being boiled down and reduced to their core components for use in an AI model—it is separate and apart from the use considerations for which we have traditionally assigned a value to the underlying written work. Put simply, the value in the marketplace as a work of art is immaterial to the value of the work to language models. This can redound to the benefit or detriment of the author in terms of compensation—it may very well be that a best-selling work is the poorer for model training purposes as against the lowliest trade paperback beach read.

An AI Tax Fits the Bill

The rapid introduction and advancement of AI has made clear the limitations of traditional intellectual property laws in addressing the challenges posed by language models. Licensing fees for the use of copyrighted material in training large language models, especially retrospective attempts at licensing following use, present a logistical and financial impossibility.

An AI tax, specifically targeting the use of large language models and levied on the parameter size of a given model—a rough approximation of value extracted from works by way of the variables learned by the model—could offer a structured approach to ensure companies leveraging protected materials contribute back to society and by extension the creative community. Individual licensing agreements are cumbersome and nearly impossible to negotiate at scale, whereas an AI tax could offer a streamlined and scalable solution.

The revenue generated by the tax could support a compensation fund, leveraged towards providing for cultural, educational and technological initiatives. The next best thing to the impractical compensation of individual creators is the funding of the programs and initiatives that creators, past and future, benefit from.

The implementation of an AI tax would also align with the broader tax principle of internalizing externalities—AI companies and developers would be forced to account for the broader impact of their technologies and balance innovation with social responsibility.

Broader AI Tax Policy Considerations

The most equitable and effective method for imposing an AI tax will likely require trial and error. The tax base must be chosen consciously from the scale of the data used by the AI as indicated by the model’s parameters, the revenue generated by the AI applications, or some combination of both. Additional considerations aimed at progressivity, such as a tiered system where tax rate scales with the size of the entity or the volume of data processed, should also be given careful thought.

As with all such tax regimes, the ideal is for the tax to be paid by the entities developing and profiting from AI technologies, rather than allowing said entities to shift the cost down to consumers. Such a goal can only be furthered by strict price-setting policies, which can be justified owing to the collective ownership of the underlying training data—if you’re going to use our collected works to advance your AI, you will be able to be profitable but not beyond a certain point or to the degree that the underlying product becomes unaffordable.

With regards to revenue generated by the tax, the allocation of funds is crucial for the efficacy of the broader tax policy. A transparent process, involving and in conversation with stakeholders such as representatives from the broader creative groups, is necessary.

An AI tax is necessary, on the horizon, and will require careful planning and consensus among stakeholder groups to be effective. It must balance the twin goals of compensating society for the use of copyright materials and encouraging innovation in the sector.

Read the full article here

Share This Article
Leave a comment

Leave a Reply

Your email address will not be published. Required fields are marked *