How do I... - Tuesday 26 September 2023
Searching for information with AI (Part 2): conversational agents
(Part 1: recommendations and search engines)
Thrust into the spotlight at the end of 2022 with the public release of ChatGPT (based on GPT-3.5), followed by GPT-4 in spring 2023, conversational agents are tools that generate text from statistical probabilities derived from the analysis of large corpora of documents (GPT stands for Generative Pre-trained Transformer). The term generative AI is used more generally, particularly for content other than text: computer code, mathematical formulas, images, videos, etc.
Given the tool's versatility in processing text, it can be used at "every stage of the intelligence process, from identifying needs, to sourcing, to analysis." For example, through semantic analysis, the tool can summarize one or more texts and extract the key ideas to generate keywords, which can then be used to search for other documents. The tool can also lay the foundations for market research by locating competitors, identifying their characteristics, drawing up profiles, etc.
To complete the picture, here's an overview of other tools derived from or inspired by ChatGPT that can be useful for finding information, in the form of web applications or browser extensions:
- YouTube Summary with ChatGPT to transcribe and summarize YouTube videos
- Gimme Summary AI to summarize a web page
- LegiGPT for basic legal information (in French)
ChatGPT 4 is available free of charge, but only after creating a Microsoft account, via the Bing Chat tool: go to bing.com/new with the Edge browser.
Querying them using prompts
Conversational agents are queried using prompts, i.e. instructions formulated as precisely as possible. The more precisely the instructions define the context of the information request, the more relevant the AI's response will be. For example, specify the profile of the person requesting the information, along with their level of knowledge and understanding of the subject. Also indicate the desired output format, to save time when formatting the information.
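The advice above can be sketched as a small prompt template. This is only an illustration in Python, not a method prescribed by any particular tool: the names PromptContext and build_prompt, the field names, and the sample question are all hypothetical. The sketch simply assembles the requester's profile, their level of knowledge, the question, and the desired output format into one block of text that can be pasted into a conversational agent.

```python
from dataclasses import dataclass


@dataclass
class PromptContext:
    """Hypothetical container for the elements worth specifying in a prompt."""
    requester_profile: str  # who is asking for the information
    knowledge_level: str    # how familiar they are with the subject
    question: str           # the information need itself
    output_format: str      # the desired shape of the answer


def build_prompt(ctx: PromptContext) -> str:
    """Assemble a plain-text prompt, one line per context element."""
    lines = [
        f"I am {ctx.requester_profile}.",
        f"My level of knowledge of the subject: {ctx.knowledge_level}.",
        f"Question: {ctx.question}",
        f"Please answer in the following format: {ctx.output_format}.",
    ]
    return "\n".join(lines)


# Illustrative values only; replace them with your own context.
example = PromptContext(
    requester_profile="a first-year business school student",
    knowledge_level="beginner",
    question="Who are the main players in the European electric bicycle market?",
    output_format="a table with one row per company and columns for country and product range",
)
prompt = build_prompt(example)
print(prompt)
```

Keeping each element in its own field makes it easy to reuse the same question for a different audience or output format by changing a single value.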
Using generative AI to "produce" information
In addition to information retrieval in the form of text, figures or images, so-called generative AI can of course also be used to produce content of the same type. This practice raises a number of ethical issues, the most obvious of which relate to intellectual property and academic integrity. While it is accepted, for example, that AI cannot be cited as an author in the true sense of the term, this does not mean that it cannot be cited as a writing aid for a dissertation, article, thesis, etc. The scientific and technical information department of CIRAD (Centre de coopération internationale en recherche agronomique pour le développement) makes this clear:
“The Committee on Publication Ethics (COPE) has drafted a position statement (Authorship and AI tools - COPE position statement, version 1, 13/02/2023) stressing that these tools cannot be regarded as authors, that authors must describe how they use them, and that they remain responsible for the content of their publication, however they produced it.”
Images
Numerous copyright issues potentially arise, and the legal framework is still evolving. No person in the legal sense creates the content, so there is no author; by the same token, there is no free license or assignment of rights that would authorize commercial use. For the time being, the use of AI-generated content should therefore be tolerated on a not-for-profit basis, unless the tool specifies otherwise, particularly in the case of paid professional use. There is also a potential problem of cascading plagiarism: the tool may infringe copyright by "training" itself on a corpus of protected works and then reproducing their features too closely, providing its users with images that amount to "style theft".
Texts (including translations)
Good practice is to always disclose the use of a text generator and to avoid relying on it for large portions of a text, since doing otherwise would violate academic ethics, as seen above. The same applies to machine translation tools: their use should be mentioned even when the translated text has been edited afterwards.
Text-to-speech
This use of artificial intelligence for automatic language processing makes it possible to convert written text to speech and vice versa. It can be useful both for transcribing interviews and for making various textual content more accessible. However, any text copied into a free web-based tool such as TTSreader may be reused by the tool's designers to improve it, so take care to protect personal and sensitive data. Be sure to read the terms and conditions of each tool carefully.
Some ethical issues
Beyond potential infringements of copyright or academic ethics, text-generating tools pose problems in terms of the quality of the information provided. For example, ChatGPT does not cite its sources in its answers, and it relies on a limited corpus that has not been updated beyond 2021. It can "invent facts", which enables ill-intentioned people to spread disinformation that looks entirely credible. The ESSEC Learning Center is here to help you deal with these pitfalls.
The search for and production of information using tools based on artificial intelligence is indeed a rapidly evolving field, to be monitored as much from the point of view of tools and practices as from that of regulations and social developments in the broadest sense.
To go further, Semantic Scholar and Typeset.io are two AI-assisted search tools that can be used for keyword formulation and results analysis.
Sources
- Eugène, Matthieu. “ChatGPT : notre guide pour créer les meilleurs prompts.” BDM, April 17, 2023.
- Fovet-Rabot, Cécile, and Marie-Claude Deboin. “Définir Les Auteurs d’une Publication Scientifique.” CIRAD, 2014.
- Khan, Lina M. “Opinion | Lina Khan: We Must Regulate A.I. Here’s How.” The New York Times, May 3, 2023, sec. Opinion.