Results from Literature Review Method
The Literature Review, "x", served as a proof of concept (POC) for testing AI-driven coordination among a group using the Coordination Editor. Because the technology is progressing rapidly, the POC was completed with the Coordination Editor as a work-in-progress platform, allowing us to test the strategy and identify future developments and plans. Using the Literature Review as an example led to the creation of the Project Management Method use-case.
The goal of the literature review was to evaluate coordination within the team and to increase the productivity of performing a literature review by X%. Based on the team's average self-reported productivity savings, the use of AI and the Coordination Editor led to a Y% increase in productivity. Our aim is to significantly improve productivity and utility with AI in future reviews.
The literature review highlighted the importance of a novel role: the AI Business Engineer, designed to enhance coordination and collaboration. This role leverages AI tools to streamline complex processes. AI Business Engineers excel at breaking down concepts for efficient understanding, posing insightful questions to grasp domain intricacies, and identifying optimal AI-driven practices. Scientists reported several benefits from this approach, [specific benefits to be added here].
Visual Canvases emerged as a powerful tool for outlining complex concepts, proving especially valuable when used in conjunction with AI. This approach excelled at breaking down and clarifying technical discussions with scientists, particularly when combined with the analysis of meeting transcripts. By implementing this practice, the team not only fostered more efficient conversations but also created a rich, queryable knowledge base. This resource provided deeper insights into the scientists' perspectives and thought processes, enhancing overall collaboration and understanding.
Overall, one of the key benefits of using the Coordination Editor was the ability to revisit visual step outlines for data extractions and previous writeups. As we developed different narratives and versions of the writeup, this proved more effective than anticipated, especially when combined with AI.
One of the key challenges was managing citations. While the LLMs could maintain citations to some extent, the citations became difficult to track as people edited content and synthesized ideas. We attempted to use AI to find citations after the fact but found the results unreliable. It was decided that scientists would manually confirm and enter citations, as they were already familiar with the sources. Going forward, the Coordination Editor will enhance this process with metadata, and we will implement PaperQA2 for better citation management, aligning with our existing strategy.
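As an illustration of carrying citations as metadata rather than recovering them after the fact, the following is a minimal sketch in Python. The names (Citation, Snippet, paper_id, merge_snippets) are assumptions for illustration only, not the Coordination Editor's actual schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Citation:
    """Illustrative citation record; field names are assumptions, not the Editor's schema."""
    paper_id: str   # stable identifier, e.g. a DOI
    authors: str
    year: int
    title: str


@dataclass
class Snippet:
    """A piece of extracted or synthesized text that keeps its provenance attached."""
    text: str
    citations: tuple[Citation, ...]


def merge_snippets(snippets: list[Snippet], synthesized_text: str) -> Snippet:
    # When content is edited or ideas are synthesized across snippets, carry every
    # source citation forward instead of asking an LLM to recover them later.
    cited = tuple(dict.fromkeys(c for s in snippets for c in s.citations))
    return Snippet(text=synthesized_text, citations=cited)
```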
We found that using different tools could sometimes be more effective. For example, copying and pasting information into ChatGPT could spark different trains of thought and offer varied utility. Since the Editor currently only supported GPT-4, we used Claude 3.5 Sonnet to assist with writing, as we believed it performed better in this area. We also found it easier to review the writeup in Microsoft Word with track changes, as people were more accustomed to it. Future versions will evaluate whether the Coordination Editor is optimal for this part of the process as the platform matures.
We confirmed that Retrieval-Augmented Generation (RAG) did not perform consistently well for the literature review. RAG approaches sometimes synthesized well across papers but, depending on the topic, often produced overly generic results. Creating a workflow to extract data from each publication individually worked well. In performing data extractions, we found that striking the right balance between specificity and generality was crucial: prompts overfitted to a specific topic did not transfer well to other publications, while overly general prompts did not capture the necessary information. We also discovered that the research topic affected how we structured the extraction prompts. This led to the 'Analysis Template' setup, where scientists could input this information: a baseline prompt was used, with different approaches specified for each section (see the sketch below). As with many of our learnings, this is our current approach and may change as the technology advances, for example with PaperQA2's advanced RAG.
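The following is a minimal sketch, under stated assumptions, of how such a per-publication extraction workflow driven by an Analysis Template might be structured. The names (AnalysisTemplate, build_prompt, the llm callable) are illustrative and do not reflect the actual implementation.

```python
from dataclasses import dataclass, field


@dataclass
class AnalysisTemplate:
    """Hypothetical structure for the 'Analysis Template' described above."""
    research_topic: str
    baseline_prompt: str  # shared instructions applied to every section
    section_prompts: dict[str, str] = field(default_factory=dict)  # section -> specific guidance

    def build_prompt(self, section: str, paper_text: str) -> str:
        # Combine the baseline with the section-specific guidance for one publication.
        return "\n\n".join([
            f"Research topic: {self.research_topic}",
            self.baseline_prompt,
            self.section_prompts.get(section, ""),
            f"Publication text:\n{paper_text}",
        ])


def extract_from_papers(template: AnalysisTemplate, papers: dict[str, str], llm) -> dict:
    """Per-publication extraction: one LLM call per paper per section,
    rather than a single RAG query across the whole corpus."""
    results: dict[str, dict[str, str]] = {}
    for paper_id, text in papers.items():
        results[paper_id] = {
            section: llm(template.build_prompt(section, text))
            for section in template.section_prompts
        }
    return results
```

In this sketch, `llm` is any callable that takes a prompt string and returns text, so the same workflow could be pointed at different models as the platform evolves.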