Ren Art 21-A to be Art 21-B, add Art 21-A 338 - 338-d, Gen Bus L
 
Enacts the "New York artificial intelligence transparency for journalism act"; requires developers of generative artificial intelligence systems or services to post certain information on the developer's website regarding video, audio, text and data from a covered publication used to train the generative artificial intelligence system or service; grants journalism providers authority to request a subpoena requiring developers to comply with posting such information.
STATE OF NEW YORK
________________________________________________________________________
8595--A
2025-2026 Regular Sessions
IN ASSEMBLY
May 22, 2025
___________
Introduced by M. of A. OTIS -- read once and referred to the Committee
on Science and Technology -- committee discharged, bill amended,
ordered reprinted as amended and recommitted to said committee
AN ACT to amend the general business law, in relation to requiring tran-
sparency from generative artificial intelligence developers for jour-
nalism providers
The People of the State of New York, represented in Senate and Assembly, do enact as follows:
1 Section 1. This act shall be known and may be cited as the "New York
2 artificial intelligence transparency for journalism act".
3 § 2. Legislative findings. The legislature hereby finds and declares
4 that:
5 (a) A free and diverse press was critical in the founding of our
6 democracy and continues to be the lifeblood for a functional society;
7 (b) New York has a compelling interest in protecting news publishers
8 and broadcasters that report and distribute news from unfair business
9 practices and competition. Every day, journalism plays an essential role
10 in New York and in local communities. Protecting the ability of local news
11 organizations to continue to provide the public with critical informa-
12 tion about their communities, and enabling news publishers and broadcast-
13 ers to receive fair market value for their content that is used by
14 others, will preserve and ensure the sustainability of local and diverse
15 news outlets;
16 (c) Communities without newspapers and broadcast news programs lose
17 touch with government, business, education, and neighbors. They operate
18 without journalists working to keep them informed, uncover truth, expose
19 corruption, and share common goals and experiences;
20 (d) Quality journalism is key to sustaining civic society, strengthen-
21 ing communal ties, and providing information at a deep level;
EXPLANATION--Matter in italics (underscored) is new; matter in brackets
[] is old law to be omitted.
LBD13206-02-5
1 (e) Seventy-three percent of United States adults surveyed said they
2 have confidence in their local newspaper. Broadcasting remains a domi-
3 nant and trusted source of news in communities throughout New York;
4 (f) Studies show that news content comprises a disproportionate amount
5 of generative artificial intelligence training data. News content is
6 especially valuable to artificial intelligence developers because it is
7 high-quality, professional writing created by human beings;
8 (g) After training, generative artificial intelligence systems contin-
9 ue to access news websites, podcasts, broadcasts and digital platforms
10 in order to gain access to fact-checked, accurate and up-to-date content to
11 produce outputs;
12 (h) The vast majority of generative artificial intelligence developers
13 do not obtain permission or compensate news publishers or broadcast news
14 operations for accessing their websites, podcasts, broadcasts and
15 digital platforms for the purposes of building and operationalizing
16 their AI tools and services, in violation of copyright law, those sites'
17 and platforms' terms of service and express prohibitions and prefer-
18 ences;
19 (i) Maximizing the potential of generative AI requires ensuring the
20 sustainability of journalism and the news industry; and
21 (j) News publishers, broadcast news operations and the public deserve
22 to know when generative artificial intelligence developers have accessed
23 news websites and used their work.
24 § 3. Article 21-A of the general business law is renumbered article
25 21-B and a new article 21-A is added to read as follows:
26 ARTICLE 21-A
27 ARTIFICIAL INTELLIGENCE SOURCE DATA TRANSPARENCY
28 Section 338. Definitions.
29 338-a. Artificial intelligence source data transparency.
30 338-b. Enforcement.
31 338-c. Applicability.
32 338-d. Severability.
33 § 338. Definitions. The following terms, whenever used or referred to
34 in this article, shall have the following meanings:
35 1. "Artificial intelligence" means a machine-based system that can,
36 for a given set of human-defined objectives, make predictions, recommen-
37 dations, or decisions influencing real or virtual environments, and that
38 uses machine and human-based inputs to perceive real and virtual envi-
39 ronments, abstract such perceptions into models through analysis in an
40 automated manner, and use model inference to formulate options for
41 information or action.
42 2. "Access" means to obtain, retrieve, acquire, reproduce, crawl,
43 index, or request and receive a transmission of content.
44 3. "Covered publication" means any print, broadcast, broadcast network
45 or digital publication or service which:
46 a. performs a public-information function comparable to that tradi-
47 tionally served by journalism organizations, such as newspapers, broad-
48 cast news operations, broadcast network news operations, magazines and
49 other periodical publications;
50 b. invests substantial expenditure of labor, skill, and money to
51 create, edit, produce, and distribute content including by engaging
52 natural persons to create, edit, produce, and distribute original text,
53 audio, photo, illustrative, or video content concerning matters or
54 topics of interest or use to members of the public through activities
55 such as observation, video recording events, interviews, research, test-
56 ing, and analysis; and
1 c. publishes new content or updates its content on at least a monthly
2 basis and has a process for error correction and clarification.
3 4. "Crawler" means software that accesses content from a website or
4 other internet source, such as an online crawler, spider, fetcher,
5 client, bot, user agent or equivalent tool.
6 5. "Developer" means a person that designs, codes, produces, or
7 substantially modifies an artificial intelligence system or service for
8 use by members of the public. The term "developer" shall not include
9 a journalism provider that uses, develops or obtains an artificial
10 intelligence system or service for internal use.
11 6. "Generative artificial intelligence" means a class of artificial
12 intelligence models that emulate the structure and characteristics of
13 input data to generate derived synthetic content, including, but not
14 limited to, images, videos, audio, text, and other digital content.
15 7. "Journalism provider" means any person that:
16 a. broadcasts or publishes one or more covered publications; and
17 b. is covered by media liability insurance.
18 8. "Person" means a natural person, corporation, trust, estate, part-
19 nership, incorporated or unincorporated association or any other legal
20 entity.
21 9. "Artificial intelligence utilization" means to use digital content
22 as data to develop the capabilities of a generative artificial intelli-
23 gence system, including through setting or changing its learnable
24 weights and other parameters, and includes, in addition to the initial
25 dataset training, further testing, validating, grounding, or fine tuning
26 by the developer of the artificial intelligence system or service.
27 § 338-a. Artificial intelligence source data transparency. 1. a. On or
28 before January first, two thousand twenty-seven and before each time
29 thereafter that a generative artificial intelligence system or service,
30 or a substantial modification to a generative artificial intelligence
31 system or service released on or after January first, two thousand twen-
32 ty-two, is made publicly available to New Yorkers for use, regardless of
33 whether the system or service is made available for a fee, the developer
34 of the system or service shall post on the developer's internet website
35 the following information regarding video, audio, text and data from a
36 covered publication used to train the generative artificial intelligence
37 system or service:
38 (i) the uniform resource locators or uniform resource identifiers
39 accessed by crawlers deployed by the developer or by third parties on
40 their behalf or from whom they have obtained video, audio, text or data;
41 (ii) a detailed description of the video, audio, text and data from a
42 covered publication used for artificial intelligence utilization,
43 including the type and provenance of the video, audio, text and data and
44 the means by which it was obtained, sufficient to identify individual
45 works;
46 (iii) whether any source identifiers, terms, or copyright notices were
47 removed from the video, audio, text or data; and
48 (iv) the timeframe of data collection.
49 b. The information required to be posted on a developer's internet
50 website pursuant to paragraph a of this subdivision shall not be
51 required where there is an express written agreement authorizing the
52 developer to access the journalism provider's content and the parties
53 agree not to post information relating to the journalism provider's
54 content on the developer's website.
55 2. a. On or before January first, two thousand twenty-seven, the
56 developer of a generative artificial intelligence system or service who
1 deploys a crawler, either directly or through a third party, in
2 connection with such system or service shall disclose information
3 regarding the identity of crawlers used by the developer or by third
4 parties on the developer's behalf in a manner clearly accessible by a
5 website operator, including but not limited to:
6 (i) the name of the crawler, the crawler's IP address, and the
7 specific identifier actually used by the crawler when conducting the
8 crawling activity (such as an identifier included as part of the user
9 agent or other part of the request headers);
10 (ii) the legal entity responsible for the crawler;
11 (iii) the specific purposes for which each crawler is used;
12 (iv) the legal entities to which operators provide data scraped by the
13 crawlers they operate; and
14 (v) a single point of contact to enable third parties whose websites
15 are accessed by such crawlers to communicate with the developer and to
16 lodge complaints.
17 b. The information disclosed pursuant to paragraph a of this subdivi-
18 sion shall be available on an easily accessible platform and updated at
19 the same time as any change is made to such information.
20 c. The exclusion of a crawler by a website operator shall not nega-
21 tively impact the findability of the website operator's content in a
22 search engine.
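Illustrative note (not part of the bill text): the crawler disclosure items listed in subdivision 2 of § 338-a could be kept as a structured record and checked for completeness before publication. The following is a minimal sketch only. The bill does not prescribe any format; the class name, field names, the helper function, and every sample value are hypothetical.

```python
from dataclasses import dataclass, fields


@dataclass
class CrawlerDisclosure:
    # Hypothetical field names; each maps to an item in § 338-a(2)(a).
    name: str                    # (i) name of the crawler
    ip_addresses: list[str]      # (i) IP address(es) used by the crawler
    user_agent_identifier: str   # (i) identifier actually sent with requests
    responsible_entity: str      # (ii) legal entity responsible for the crawler
    purposes: list[str]          # (iii) specific purposes for the crawler
    data_recipients: list[str]   # (iv) legal entities that receive scraped data
    contact: str                 # (v) single point of contact for complaints


def missing_items(record: CrawlerDisclosure) -> list[str]:
    """Return the names of fields that are empty in the record."""
    return [f.name for f in fields(record) if not getattr(record, f.name)]


# Sample values are invented for illustration only.
example = CrawlerDisclosure(
    name="ExampleBot",
    ip_addresses=["192.0.2.10"],
    user_agent_identifier="ExampleBot/1.0",
    responsible_entity="Example AI, Inc.",
    purposes=["training data collection"],
    data_recipients=["Example AI, Inc."],
    contact="",
)
print(missing_items(example))  # ['contact']
```

This sketch treats any empty value as missing. A developer that provides scraped data to no other entity would need some way to state that explicitly, which the sketch does not model.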
23 § 338-b. Enforcement. 1. a. A journalism provider, or a person author-
24 ized to act on a journalism provider's behalf, may request the clerk of
25 the supreme court, or a judge where there is no clerk, to issue a
26 subpoena to a developer of a generative artificial intelligence system
27 that is made available to New Yorkers for use, regardless of whether the
28 system or service is made available for a fee, for disclosure of copies
29 of, or records sufficient to identify with certainty, the text and data
30 used to train the generative artificial intelligence system or service
31 insofar as such text and data pertain to the journalism provider's
32 internet website, broadcasts, podcasts or other digital platforms,
33 including but not limited to:
34 (i) the uniform resource locators accessed by crawlers deployed by
35 developers or by third parties on their behalf or from whom they have
36 obtained text, video, audio or data, and dates and times of collection;
37 and
38 (ii) the text and data used for artificial intelligence utilization,
39 including the type and provenance of the text and data and the means by
40 which such text and data were obtained and when.
41 b. A subpoena issued pursuant to paragraph a of this subdivision may
42 require disclosure of the information required pursuant to paragraph a
43 of this subdivision in the native form in which such information was
44 copied and stored (including all accompanying keys, values, tags, and
45 the like, and any other available metadata), subject to entry of a suit-
46 able protective order in the case that such information constitutes a
47 trade secret of the generative artificial intelligence system developer.
48 c. The developer shall provide the subpoenaed information within thir-
49 ty days of service of the subpoena or, in the case of trade secrets,
50 entry of a suitable protective order. Such subpoena shall be subject to
51 the provisions of article twenty-three of the civil practice law and
52 rules. The court may impose a penalty for failure to respond to such
53 information subpoenas pursuant to section twenty-three hundred eight of
54 the civil practice law and rules.
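Illustrative note (not part of the bill text): paragraph c sets a thirty-day response period that runs from service of the subpoena or, where trade secrets are involved, from entry of a protective order. The sketch below shows that arithmetic under a simplifying assumption of plain calendar days. It does not apply any time-computation rules from the civil practice law and rules or other law, and the function and parameter names are hypothetical.

```python
from datetime import date, timedelta
from typing import Optional

# Assumption: thirty plain calendar days, with no adjustment for
# weekends, holidays, or method of service.
RESPONSE_WINDOW = timedelta(days=30)


def response_due(served: date,
                 protective_order_entered: Optional[date] = None) -> date:
    """Return the date by which the subpoenaed information is due.

    If a protective order was entered (the trade secret case), the
    period is counted from that date; otherwise from service.
    """
    if protective_order_entered is not None:
        start = protective_order_entered
    else:
        start = served
    return start + RESPONSE_WINDOW


print(response_due(date(2027, 3, 1)))                     # 2027-03-31
print(response_due(date(2027, 3, 1), date(2027, 3, 15)))  # 2027-04-14
```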
1 2. a. A journalism provider may bring an action in the supreme court
2 for an injunction to compel a developer to comply with section three
3 hundred thirty-eight-a of this article.
4 b. If a developer fails to comply with a subpoena issued pursuant to
5 subdivision one of this section, the journalism provider requesting such
6 subpoena may move in the supreme court to compel compliance. If the
7 court finds that the developer did not comply with the subpoena, the
8 court shall order compliance and may award statutory damages of up to
9 ten thousand dollars to the journalism provider requesting such
10 subpoena.
11 c. If the developer fails to comply with a court order issued pursuant
12 to paragraph b of this subdivision, then the journalism provider may
13 request that the attorney general bring an action on their behalf to
14 ensure compliance with the court order and any statutory damages
15 assessed.
16 § 338-c. Applicability. The provisions of this article shall not be
17 construed to modify, impair, expand, or in any way alter rights pertain-
18 ing to Title 17 of the United States Code or the Lanham Act (15 U.S.C.
19 1051 et seq.).
20 § 338-d. Severability. If any provision of this article or the appli-
21 cation thereof to any person or circumstances is held to be invalid,
22 such invalidity shall not affect other provisions or applications of
23 this article which can be given effect without the invalid provision or
24 application, and to this end the provisions of this article are severa-
25 ble.
26 § 4. This act shall take effect immediately.